FOREWORD


This issue of the Review, like corresponding preceding issues, presents 
a cross section of research methodologies used during the last triennium. 
The topic, sampling, is omitted because it is being covered by the report of
the Second Annual Phi Delta Kappa Symposium, which is shortly to appear.
Chapters in it by Leslie Kish and F. G. Cornell cover sampling from
an elementary level to more complex levels. Discussion of instrumentation
is added in this issue because educational research has reached a stage of 
development where pencil-paper data-collection techniques are no longer 
sufficient. 

The importance of instruments in the physical sciences was indicated by 
Klopsteg (1960). Among the 138 Nobel laureates from 1901 to 1960, 
recognition was accorded to 112 for research in which instrumentation was 
a vital means. Both American Nobel prize winners in 1960 were recog- 
nized because of their contributions to instrumentation. 

Some educational phenomena are inaccessible to direct observation, and 
others occur so rapidly or so frequently that a human observer is over- 
whelmed. As a result of recent advances in instrumentation, it is now 
feasible to record certain aspects of human behavior—particularly those 
related to stress and motivation—in a manner that is more objective than 
pencil-paper techniques. Interest in mechanical instructional and testing 
devices has been revived. For educational researchers such devices ap- 
pear to be particularly helpful in providing a means of rigorous control in 
studies of instruction and learning. Virtually all applications of instru- 
mentation in educational research have been in studies of instruction and 
learning by means of mechanical devices; the chapter devoted largely to 
teaching machines is especially apropos. 

Nicholas A. Fattu, Chairman
Committee on the Methodology 
of Educational Research 










CHAPTER I 


The Role of Research in Education— 


Present and Future 


NICHOLAS A. FATTU 



This chapter is devoted to consideration of materials relevant to educational
research methodology, but not treated in the chapters as they are
constituted. The discussion is framed broadly in terms of two questions: 
What is the role of research in education? and What is a desirable role for 
educational research in the future? The first describes some issues relative 
to educational research. The second discusses recent developments in the 
social sciences that have not been, but might profitably be, explored in 
educational research: model building, simulation techniques, systems an- 
alysis, mechanized learning and thinking models, information theory, and 
decision theory. 


The Role of Research in Education 


Studies of the role of research in education were concerned with discussions
of professional status and professional responsibility: Brown
(1960); Harris (1960); Flagle, Huggins, and Roy (1960); Goode (1958);
Hunt (1956); Kidd (1959); and the first Phi Delta Kappa symposium 
(Banghart, 1960). 

It was said that research provides the foundation of professional status. 
Brown (1960) summarized the requirement of a profession for practi- 
tioners (a) who are free and responsible individuals and who can be 
depended on because of their professional integrity to establish and main- 
tain their professional standards of performance; (b) who keep a learning 
approach throughout life as a means of fulfilling their professional respon- 
sibilities through ready application of new knowledge. 

Harris (1960) urged a “coming of age” in education. Technological 
schools, he contended, by abandoning the trades-training approach and 
instituting abstract theoretical approaches, now design engineering cur- 
riculums to make extensive use of intellectual formulations and research. 
According to Harris, technology, by coping intellectually with the problems 
it faced, won increasing respect and stature, but education appears to be 
still largely an application of psychological rules of thumb. Harris as- 
serted that, to “grow up,” education must conceptualize its processes and 
develop a series of new intellectual formulations. Improved conceptualization
was also urged by the American Council on Education (ACE) (1939);
the American Educational Research Association (AERA) (1956); Brim
(1958); Brown (1960); Coladarci (1960); Goethals (1958); McConnell,
Scates, and Freeman (1942); Travers (1958); Traxler (1954); and Ulich
(1937).

Flagle, Huggins, and Roy (1960) maintained that the professions have 
been forced to give research a larger role by the rapidly changing character 
of the world. For example, coal can be mined, iron can be smelted and 
refined, easily located petroleum can be exploited without scientific aid; 
but it is estimated that within a generation 75 percent of electrical energy 
must come from nuclear or solar sources. With unprecedented population 
increase, underdeveloped nations demand their full share of the world’s 
goods. Inevitably all phases of civilization must become more complex 
and technical and demand greater scientific sophistication. Technology has 
become intellectual and strongly oriented toward research because the 
demands of the world have forced it to. 

Not only have science and technology become more complex, but the 
rate at which changes occur has led to further problems. Johnson (1960b) 
estimated that knowledge of the physical sciences doubles every 15 years, 
and of the social and management sciences every 50 years. The latter 
increases at about the same rate as the population of the world. Gen- 
eral Electric has indicated that over 40 percent of the products it currently 
sells were not in existence 10 years ago (Suits, 1958). 
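
The doubling periods quoted above can be restated as annual growth rates, which makes the two figures easier to compare. The following is a minimal sketch, assuming steady exponential growth (an assumption made here for illustration, not a claim from Johnson); the function names are hypothetical.

```python
def annual_growth_rate(doubling_years: float) -> float:
    """Constant yearly growth rate that doubles a quantity in `doubling_years`.

    Assumes steady exponential growth: (1 + r) ** doubling_years == 2.
    """
    return 2.0 ** (1.0 / doubling_years) - 1.0


def growth_factor(doubling_years: float, elapsed_years: float) -> float:
    """Factor by which the quantity grows over `elapsed_years`."""
    return 2.0 ** (elapsed_years / doubling_years)


if __name__ == "__main__":
    # Doubling periods taken from the text: 15 years (physical sciences)
    # and 50 years (social and management sciences).
    for label, years in [("physical sciences", 15.0),
                         ("social and management sciences", 50.0)]:
        rate = annual_growth_rate(years)
        print(f"{label}: about {rate:.1%} per year, "
              f"x{growth_factor(years, 30.0):.2f} over 30 years")
```

Under this assumption a 15-year doubling period corresponds to growth of a little under 5 percent a year, and a 50-year period to a little under 1.5 percent a year.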

Brim (1958), Becker (1960), Hunt (1956), Kidd (1959), and Traxler
(1954) saw educational research as not keeping pace with the world.
Becker (1960), finding an investment in American education of 24 billion
dollars during 1960, observed serious deficiencies at all levels, and he be- 
lieved that educational resources must be used more efficiently. His opin- 
ions were shared by Keezer (1960a) and by the National Bureau of 
Economic Research report on the economics of education. Economics of 
research and education was also explored by Keezer (1960a, b), Schultz 
(1959), Shockley (1957), and Siegel (1960). The point emphasized was 
that continuing expenditure on education presupposes a continual flow of 
good ideas. Simons (1960) saw the lack of such ideas as crucial and in- 
dicative of a necessity for greater emphasis on basic research. 

The opinion that educational research has not kept pace with the world 
was widely expressed. Brim (1958) reported on deficiencies in educational 
research and proposed work to be performed by social scientists. Several 
professional organizations have expressed their concern in various ways. 
The Organization for Research in Education was established by the Na- 
tional Academy of Sciences and the National Research Council. (It was 
dissolved when the Council for Research in Education was established.) 
According to the first Phi Delta Kappa symposium (Banghart, 1960), 
more educational researchers are employed by foundations, industrial 
organizations, and agencies of the federal government than by public 
schools and universities. 

Some notable activities were directed toward increasing educational 
research: the Council on Educational Research was established through 
the efforts of the late Percival M. Symonds and his associates at AERA. 
The Phi Delta Kappa Annual Symposium on Educational Research and
the Big Ten Research Directors Conferences were instituted. The Center 
for Advanced Study in the Behavioral Sciences has begun to consider 
educational researchers. 

The most important boost for educational research was the establish- 
ment of the Cooperative Research Program of the U.S. Office of Education 
and the various titles within the National Defense Education Act. When 
the history of educational research is reviewed with the perspective of the 
future, these federal programs will probably stand out as the significant 
turning points in educational research. 

Unfortunately these efforts are still too little and too late. A recent 
survey reported at the first Phi Delta Kappa symposium (Fattu, 1960) 
indicated that, of the 94 colleges and universities which grant the doc- 
torate in education, only 10 could be said to be making a serious effort to 
encourage educational research by maintaining a favorable intellectual 
climate and giving adequate financial support, by making time and facili- 
ties available for faculty research, or by giving significant consideration to 
research when appointing new faculty members. It was suggested that the 
observed indifference to research might be related in part to the domina- 
tion of these institutions by practitioners who attained their positions of 
influence through literary and forensic skills rather than through contri- 
butions to and understanding of science. In terms of allocation of re- 
sources—finances and faculty time—all of the 10 most highly respected 
institutions devoted more to research than to field services; among the 
rest the emphasis was reversed. Similar findings were reported by Phillips 
(1957) and Ryans (1957). 

To summarize, more research is needed if education is to carry out its 
responsibilities in a rapidly changing world. More funds and other support 
are necessary to educational research. 

Although American public education is more efficient than at any 
earlier time (it is probably the most efficient in the world), it is not as 
effective as it can and must be to maintain the American way of life. There 
are many competent, dedicated educational researchers, but their number 
does not meet the demand. Current trends in industry and government 
suggest that other agencies are prepared to assume responsibility for 
adding new knowledge. The implications of such an outcome for education 
as a profession should be a matter of concern to all educators. 


The Nature of Educational Research 


Educational research seemed to have fluid boundaries encompassing 
virtually all phases of scholarly activity associated with the educative 
process and organization. It included carefully designed experimental 
studies of current and proposed practices; mass collections of data, such as 
surveys, not illumined by systematic conceptual guiding lines, thought of 
as routine work; theoretical, historical, philosophical, and integrative
scholarly activities; critical reviews of research literature and summaries 
of issues and problems; applied research focused on local practices and 
policies, planned to stimulate interest in more fundamental studies, as 
well as to develop the school staff or solve an immediate problem. 

The first Phi Delta Kappa symposium (Banghart, 1960) defined educa- 
tional research variously as ranging from routine clerical operations to 
sophisticated disciplined inquiry. Descriptions of educational research in- 
cluded a variety of activities: listings and tabulation by titles (Blackwell, 
1958; Brehaut, 1958); surveys of activities of researchers or organiza- 
tions (Phillips, 1957; Ryans, 1957; H. K. Miller, 1958; MacArthur, 1958; 
Weitz, 1957); discussions of the nature of educational research (American
Council on Education, 1939; AERA, 1956; Coladarci, 1960; McConnell, 
Scates, and Freeman, 1942; Levin, 1956; Travers, 1958; Traxler, 1954; 
Ulich, 1937; Walker, 1956); discussions of a framework for educational 
research (Goethals, 1958; Tiedeman and Cogan, 1958); discussions of 
activities of scientists (Schwab, 1960; Simons, 1960; Helmer and Rescher, 
1959). 

A consideration related to the definition of educational research is im- 
plied by the question, Is there a legitimate area for educational research? 
Discussion of the question appeared in several forms, but may be summa- 
rized as follows: Education is a practice and an art. The basic findings 
come from psychology, sociology, and other social sciences. Education 
takes these findings and applies them. 

It is difficult to reconcile such a position with that observed among 
groups which currently make the most use of research—government, in- 
dustry, and medicine. These fields recognize that discovery of new knowl- 
edge is only one step in the process toward effective utilization. For ex- 
ample, knowledge required to produce nuclear fission existed before the 
Manhattan project; it took a great deal of applied research and develop- 
ment to translate it into products and processes. In fact, the recent studies 
of the research and development process by the Carnegie Institute of 
Technology indicate that it is twice as costly (in time and resources) to 
produce the product or process as it was to make the original discovery.

A second relevant question is, What standards of research performance
are self-imposed or enforced by the group? Again direct recent considera- 
tion is scarce. Lerner’s (1959) and Weiss’s (1960) comments more di- 
rectly suggest that standards of expectation might be more explicitly 
defined and enforced. About a quarter of a century ago more direct 
attention seems to have been given to this matter (McConnell, Scates, and
Freeman, 1942; ACE, 1939).


Desirable Amount of Research 


No studies were discovered in the field of education that gave direct 
attention to the question of how much research is desirable. The National 
Science Foundation awarded grants to the Carnegie Institute of Technology
and the Western Reserve University to study this problem in the
physical sciences. 

Discussions of this topic found in business publications were relatively 
frequent, probably because survival in a rapidly changing competitive 
environment demands innovation. The rule of thumb was, Don’t do any 
less than your nearest competitor. 

Becker (1960) commented on the effects of underinvestment in education.
Noting that public and private expenditures for education run to
many, many billions of dollars each year, he pointed out that all types of 
education offer a fertile ground for comparative productivity and input- 
output studies. 


The Distribution of Research Activity 


Research activities are classified by the National Science Foundation 
as “basic research,” “applied research,” and “development.” 

Basic research includes original investigation for the advancement of
scientific knowledge. The primary aim of the investigator is achievement 
of fuller knowledge or understanding of the subject matter under study, 
rather than making practical applications of new knowledge. Applied re- 
search is directed toward practical applications of scientific knowledge. 
Development is the systematic use of scientific knowledge for the produc- 
tion of useful materials, devices, systems, methods, or processes, exclusive 
of design and production engineering (Fattu, 1960). It is evident that the 
sequence from research to action is in that order. An invention of a device, 
procedure, or method cannot be made until the key, or last essential, fact 
is discovered: for example, a television set could not be produced until 
all the basic discoveries of electromagnetic radiation and synchronization 
of transmitted impulses had been made. 

Tyler, in the Phi Delta Kappa symposium (Banghart, 1960), illustrated 
the utility of basic research using research in connection with hybrid 
corn as an example. Applied research on corn and cultivation practices 
had brought relatively small increments in yield; the development of 
hybrid corn, however, produced greatly increased yield. Here the break- 
through resulted from knowledge of plant genetics rather than from culti- 
vation practices. The original discovery was made in 1908, but applica- 
tions were not made until the 1930’s when economic pressures forced 
the development. Also, hybrids must be developed or adapted to fit condi- 
tions of a region. Griliches (1957) summarized the story in detail and 
cited many related references. The example should be instructive to one 
who wishes to trace the interaction of basic research, applied research, 
and development. 

Colleges and universities claim to add to as well as to disseminate 
knowledge; hence it would seem that basic research should find a congenial
atmosphere within the university. The National Science Foundation
reported that, in engineering schools, 57 percent of total expenditures
budgeted for research and development was devoted to basic research. 
In industry, funds for basic research totaled 344 million dollars, or about 
4 percent of the 9.4 billion dollars spent for research and development. 
Corresponding data for educational research are not available and would 
be meaningless at the present time. Certainly, educational research re- 
quires more applied research and development than basic research, but 
the funds available for all educational research are so much less than 
those available in other areas that the task would seem to be first raising 
the amount, before considering the distribution. 


Selection and Preparation of Educational Research Workers 


Comments on training for research were presented by the American 
Psychological Association (APA) (1959), Brim (1958), Brown (1960), 
Cronbach (1957), Goode (1958), Harris (1960), Keezer (1960b), Kidd 
(1959), Travers (1958), and Walker (1957). 

Selection of research workers was differentiated from that of practi- 
tioners. According to Cronbach (1957), Taylor (1956, 1958, 1959), and 
Thistlethwaite (1959), selection of researchers should emphasize cre- 
ativity, as well as measures of aptitude, school performance, and motiva- 
tion toward original inquiry. 

It was suggested that a high grade in undergraduate work might
be evidence of conformity that might be undesirable in research. Undergraduate
performance in tasks requiring creativity, originality, and intellectual
nonconformity was thought of as probably being a better predictor.
Motivation toward research was also considered a prime criterion
for selection. Perseverance seemed a significant factor in scientific achieve- 
ment. (In his autobiography, Max Planck stated that for 19 years the 
exploration of the Second Law of Thermodynamics occupied every waking 
moment that he could recall. Kepler and Galileo worked more than 30 
years before they produced their formulations. Breakthroughs in science 
apparently require a high order of creativity and a concentrated effort 
sustained over a period of many years.) It seems reasonable to believe 
that the more complex the area of investigation, the more sustained effort is 
required. 

There was agreement that the training of researchers should also differ 
from that of practitioners. It was suggested by several authors, including 
Helmer and Rescher (1959), that researchers need to understand the 
strategy and tactics of science and the language of science (including 
modern mathematics) and an academic scientific area. The preparation of 
research workers in the physical sciences appears to be more demanding 
than that for social scientists. 

Agreement was almost unanimous that the best preparation for research 
is apprenticeship to a skilled researcher. The opportunity to participate 
in and carry on independent research and publication was regarded as 
indispensable. The APA report (1959) summarized this point of view
as follows: “Everything we have found points to the fact that course work, 
formal examination requirements, and anything else that could be stand- 
ardized concerns what is ancillary to research training. What is of the 
essence is getting the student into a research environment and having him 
do research with the criticism, advice, and encouragement of others who 
suffer the same pain and enjoy the same rewards. . . . Research is learned 
by doing and taught mainly by contagion. Research must first be going 
on if there is to be research training. What formal courses are offered is 
no index of quality of a department as regards such training; the only 
adequate index is the eventual productivity of the individuals that the 
department produces.” 

The first topic discussed here has been some issues relative to educa- 
tional research. No definitive answers were found, and at this time it 
would be premature to offer any. However, the well-being of education
as a profession may lie in serious consideration of these and related
topics.


Some Recent Developments in Educational Research 


This section is a brief discussion of recent developments—operations 
research and systems analysis—that have been used profitably in a social 
science. Perhaps these methods can be explored, applied, and revised to 
help solve certain problems in education. 

Operations research is the application of mathematical and other sci- 
entific procedures and common-sense procedures to the solution of prob- 
lems encountered within an organization—specifically to co-ordinate the 
operations of the various functional units to attain the over-all objectives 
of the organization. Operations research may be defined as the applica- 
tion of scientific methods, techniques, and tools to problems involving 
operations of enterprises in order to provide optimal solutions. 

Kershaw and McKean (1959) discussed the potential for operations 
research in relation to education in general terms. A comprehensive sum- 
mary was made by Dorfman (1960). (In reading Dorfman, one should 
bear in mind that to master the mathematics is not to qualify as an opera- 
tions researcher; one learns to plan and carry out operational experiments 
by experience.) A general view can be had from Johnson (1960a) in 
conjunction with Dorfman; then Flagle, Huggins, and Roy (1960) and 
Machol (1960). A student who wishes to study the matter thoroughly 
should consult the extensive bibliographies of the Case Institute of Tech- 
nology Operations Research Group (1958) and Shubik (1960). 

Some topics of operations research potentially useful for the study 
of educational problems are mathematical model building, mechanized 
models of the learning and thinking processes, and simulation proce- 
dures. A model is thought of as an analogue. It reproduces those features 
of the thing modeled that are significant for the purpose at hand. In some
cases, significant features are directly observable—as with maps, ge- 
ological or topological representations, buildings, and the like. Models 
may incorporate features which show how the thing modeled responds to 
forces acting on it—models of ships, airplanes, electric generating systems, 
or atoms. 

Orcutt (1960) saw a model as a physical representation, a prose description,
an example of pictorial geometry, a mathematical statement,
or a computer program presentation. Some concepts can be described
and worked more easily in the language of one discipline than in that of 
another. In physical science, the optimal description appears to have 
been achieved in rigorous mathematical models. Mathematical models are 
preferred because of their precision in representing the pertinent data 
and because of the accuracy of their substantive interpretation. Mathe- 
matical models represent the basic structure of physical science. It was 
claimed that in the social sciences models have been brought to a stage 
where objective scientific method can be applied to them; both Cronbach 
(1957) and Thomson (1960) stated that mathematical formulation con- 
stitutes an aspect of science. 

Machol (1960) stated that “it is possible to describe analytically any 
human function which can be reasonably defined in objective terms,” 
and he included thinking insofar as the term is definable. Arrow, Karlin, 
and Suppes (1960) edited a symposium on mathematical models in eco- 
nomics, management science, and psychology. Bush and Estes (1959) 
presented similar models of various learning functions. 

For those unfamiliar with the field, a suggested order of reading, starting
with verbal description, follows: Lachman (1960) for general discussion
of models in theory construction; Latil (1957) for cybernetics;
Cyert, Feigenbaum, and March (1959) for a comprehensive review of
management applications; Miller, Galanter, and Pribram (1960) for a
discussion of “Totes.” Mathematical background can be had from Cogan
(1959) and Karlin (1959). Without the mathematics, these methods 
cannot be used. Perhaps a team approach might make problems more 
tractable for educational researchers. 

Mechanized or programed models of learning and thinking processes 
were discussed at the verbal level by Friedberg (1958), Friedberg, Dun- 
ham, and North (1959), Gelernter and Rochester (1958), and Hovland 
(1960). Rosenblatt (1958) described the perceptron or automaton for 
perceiving and recognizing geometric shapes. Reiss (1960) discussed a 
model of neuromuscular organisms, the most frequently discussed type of 
programed model. The advantage of mechanized models is that they 
are more easily understood than mathematical models, but they retain 
the feature of requiring explicit and unambiguous analysis of the opera- 
tion. Preparation of programed models points out gaps in knowledge 
and also provides incentive and means for filling the gaps. Machol (1960) 
believed that enterprising students might develop a programed model
of the instructional cycle for a variety of subject matters and educational
goals. Development of such a model would clarify what is meant by such
terms as “method,” “goals,” and “teaching.”

December 1960 THE ROLE OF RESEARCH IN EDUCATION

One of the most interesting operations-research methods is simulation. 
Conway, Johnson, and Maxwell (1959), Orcutt (1960), and Shubik 
(1960) provide a good introduction to the study. Simulation is the opera- 
tion of a model or simulator. The model is amenable to manipulations 
which would be impracticable or too expensive to perform on the entity 
represented; training jet aircraft pilots on Link Trainers is an example. 
The model can be studied, and, from it, properties of the behavior of 
an actual system inferred—an aircraft model in a wind tunnel or the 
hydraulic model of an economic system. 

The most interesting simulations are those done by an electronic com- 
puter. The machine is told in general terms how a certain phenomenon 
takes place and is programed to run through the appropriate events many 
times under varying circumstances and to give a summary of what hap- 
pened. How this is done can be seen in the instance of traffic-flow plan- 
ning. The way traffic lights are controlled in a large city is often hap- 
hazard. The mathematical problems in optimizing the setting of lights 
are beyond present human capabilities. No matter. Let the traffic com- 
mission’s plan be programed on a computing machine, and let several 
thousand programed “cars” loose through the “city,” and see how long 
it takes them on the average to reach their destinations. Then try other 
programs, and make adjustments until the flow of traffic is improved. 
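
The traffic-light illustration can be sketched in a few lines of present-day code. What follows is a minimal sketch under stated assumptions, not a procedure taken from any of the sources reviewed: the names `Light`, `make_plan`, `average_travel_time`, and `best_plan`, the 60-second signal cycle, the 30-second block time, and the candidate plans are all hypothetical values chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Light:
    cycle: float   # seconds in one full green-plus-red cycle (assumed)
    green: float   # seconds of green at the start of each cycle (assumed)
    offset: float  # shift of the cycle start, in seconds

    def wait(self, t: float) -> float:
        """Seconds a car arriving at time t must wait for a green light."""
        phase = (t - self.offset) % self.cycle
        return 0.0 if phase < self.green else self.cycle - phase


def make_plan(offsets, cycle=60.0, green=30.0):
    """A 'plan' is one Light per intersection along a single street."""
    return [Light(cycle, green, off) for off in offsets]


def travel_time(lights, start, block_time, rng):
    """Drive one programed 'car' through every light; return elapsed seconds."""
    t = start
    for light in lights:
        # Driving time per block varies a little from car to car.
        t += block_time * rng.uniform(0.8, 1.2)
        t += light.wait(t)
    return t - start


def average_travel_time(lights, n_cars=5000, block_time=30.0, seed=0):
    """Let n_cars loose at random times and average their travel times."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(n_cars):
        start = rng.uniform(0.0, 3600.0)
        total += travel_time(lights, start, block_time, rng)
    return total / n_cars


def best_plan(candidates, **kwargs):
    """Trial and error: score every candidate plan and keep the fastest."""
    scored = [(average_travel_time(make_plan(o), **kwargs), o)
              for o in candidates]
    return min(scored)


if __name__ == "__main__":
    n_lights = 6
    candidates = [
        (0.0,) * n_lights,                          # all lights change together
        tuple(30.0 * k for k in range(n_lights)),   # staggered by one block time
        tuple(15.0 * k for k in range(n_lights)),   # staggered by half a block
    ]
    for offsets in candidates:
        avg = average_travel_time(make_plan(offsets))
        print(f"offsets={offsets}: average {avg:.1f} s")
    score, offsets = best_plan(candidates)
    print(f"best plan: {offsets} ({score:.1f} s)")
```

The sketch searches only a short list of hand-picked plans. The adjustment step described in the text would correspond to generating new candidate offsets from the best plan found so far and scoring them again.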

The method is not basically different from present-day planning, which 
is also based on trial and error, but what would take years to observe, 
in actuality, takes hours in simulation. It is not surprising that simulation 
is most frequently used in gaming designed to give decision makers, in 
a matter of hours, years of experience in such matters as developing production
schedules and buying and selling stocks. In educational research,
simulation makes it possible to study players and to test hypotheses about
the behavior of individuals, decision systems, or both.

This introduction has attempted to point up some issues related to 
educational research and to suggest methods that have proved useful in 
the social sciences. For educational research to advance from a verbal 
description of educational phenomena to more precise formulations will 
necessitate that researchers have a basic knowledge of modern mathe- 
matics and computers; mastery of these fields, as well as statistics, will 
probably have to be required for the advanced degrees that certify com- 
petence in educational research. 
CHAPTER II


The Philosophy of Science in Educational Research 


MICHAEL SCRIVEN 


This chapter endeavors first to indicate where problems in educational
research arising out of the philosophy of science are crucial. It also identi- 
fies those sources which provide a background in philosophy of science 
for workers in the “sensitive” areas. Particular attention is given to work 
of the last three years, inasmuch as this subject was discussed by Brodbeck 
in the Review in 1957. Here, however, a somewhat different perspective on 
the topics discussed by Brodbeck is offered. 


Areas of Relevance 


Evaluation 


A professional logician observing educational research of the last three 
years is at once struck by certain similarities which exist between it and 
other fields of contemporary research. An analogy between research in 
educational areas and research in psychotherapeutic areas holds with 
respect to the difficulty of constructing valid experimental designs. Both 
present a number of crucial variables, consequent difficulty of performing 
statistical analysis, and elusiveness of reliable measures for those variables. 
Both fields involve value judgments, even moral value judgments, in a 
number of experimental designs and in areas where experimental investi- 
gation is highly desirable. 

The moral issues arise in connection with problems of manipulating 
the subjects appropriately in order to obtain valid experiments. More 
importantly, however, they arise with respect to the interpretation of the 
results as a basis for action within the profession itself. Examples are 
work on the “gifted” student, “adequate representation” of lower economic 
classes in the parentage of the high-school groups, the “appropriateness” 
of counseling and guidance procedures, the evaluation of colleges and high 
schools on a comparative or an absolute basis, the construction of effective 
disciplinary procedures, the introduction of automatic teaching machines, 
the “obligation” of the states or the federal government to finance or 
desegregate education, the separation of superior students into different 
sections and the associated acceleration procedures, and the interpreta- 
tion of “creativity.” These examples appeared on a survey of the Review 
of the last few years. In all these cases, the authors of the chapters in 
question are consciously or unconsciously committed to moral value 
judgments, whichever interpretation they make of the relevant phrases. 

One can, of course, give a purely operational, non-value-impregnated 
definition of the term superior college, for example, perhaps by giving a
simple index involving the proportion of its graduates who proceed to 
post-graduate work, who become highly thought of in their communities, 
or whose names appear in Who’s Who in America. To do this is simply 
to postpone the uncovering of the assumption by one step. Such an index 
can be justified only by an analysis of the desirability of such achieve- 
ments by the graduates. Those achievements must be assessed as being 
not merely the desired goals of the average college president, college 
faculty member, or college student, but also as being for the good of the 
community as a whole. Making such assessments means, essentially, 
making moral value judgments of these attainments. 

Despite the obviousness of this point, much research continues which 
employs criteria that would not survive five minutes of critical explicit 
discussion. The explanation is simple. A strong tradition in the history 
of psychology separates empiricism from ethics, and the average re- 
searcher feels completely insecure when he discovers that his criteria 
involve ethical variables. Either he does not allow himself to perceive this 
fact, or, if he does perceive it, he says nothing about it. 

He may, of course, turn his attention to variables which do not involve 
ethical components. In educational research this recourse rules out the 
most interesting problems of all, some of which have been mentioned. 
The philosopher of science has a role to play in helping the educational 
researcher with this dilemma, and in recent years extraordinary progress 
has been made toward development of a rational foundation for ethical 
judgment. Much of this work is yet unpublished, but some of it is referred 
to here (Baier, 1958; Brandt, 1959; Edel and Edel, 1959).

Thus it might be said that the defects in the utilitarian position, which 
have for so long encouraged research into nonrational ethics (the emo- 
tivist theory, for example), are now patched up, and it is possible to 
give a satisfactory, consistent, and non-question-begging utilitarian ethic. 
To disagree with the assertion just made is to disagree at a level and on 
grounds which provide little consolation for those who insist on the 
necessity for theological axioms in ethics. One of the most profound con- 
tributions which the philosophy of science has to make to educational 
research lies in the objectification of value judgments, especially social 
value judgments, that is, moral value judgments (on the utilitarian 
assumption). 

Descriptive research in education is only part of the story. The most 
interesting results in the field, from the point of view of social action, 
concern causal analysis. When problems of causation in the social sciences 
are met with, difficulties of a kind quite unlike those in the physical 
sciences are encountered. Recent years have seen, in the philosophy of 
science, the development of an acute awareness of the important differ- 
ences between the physical and the social sciences apart from that to 
which our attention has been called so frequently, namely, the involve- 
ment, in certain subareas of the social sciences, of value judgments. 
Explanation


The most interesting work in connection with this discrimination be- 
tween the physical and social sciences has occurred in the philosophy of 
history. In several recent articles, the most important of which were 
collected along with background readings in Gardiner’s (1959) anthology, 
a group of philosophers of science elaborated the differences between 
physical and humanistic explanations. The ways in which this discussion 
of historical explanation can be transposed to the field of educational re- 
search have not yet been spelled out. Even to transpose them into the 
general area of scientific psychology is an important task still awaiting 
the attention of logicians. 

Certain striking points, however, can be made. In the first place, the 
idea of explanation as deduction from true generalizations no longer 
holds (Dray, 1957). (This conclusion is not shared by the present re- 
viewer's predecessor.) The abandonment of this view of scientific explana- 
tion is due partly to a realization that it is not attained by most physical 
explanations, and partly to a realization that the explanations, in history 
and elsewhere, of behavior of human beings are highly informative and, 
in a certain sense, fully complete, although these explanations are not 
deductions from true generalizations. 

This sounds like a logician’s squabble; but, to give one example of its 
significance for the social sciences, it follows from this and certain other 
considerations of a fairly acceptable kind, that the entire Hullian tradition 
of searching for mathematico-deductive theories of human behavior is a 
waste of time. (This is not to say that it was a waste of time when it was 
first done.) If the present reviewer reads the signs aright, there will 
inevitably be, in the ensuing decades, a concentration on local rather than 
global theories of behavior and an emphasis on work using our present 
conceptual terminology rather than on introduction of new jargons. 

It is not irrelevant to a consideration of whether this is a fair prophecy 
to note in the last few years, within the field of psychology, an increas- 
ing acceptance of the criticisms of the Hullian and post-Hullian at- 
tempts at systematic theories of behavior put forward by Koch (1954) 
and others. Thus great importance must be attached to recognizing that 
the search for adequate causal analyses of human behavior does not lead 
inevitably, or even appropriately, to the development of axiomatic super- 
theories. 


Causation 


The belief of Russell (1953) and others that the use of causation in 
science is a sign of immaturity was widely accepted among traditional 
philosophers of science of the period 1925 to 1955. Where use of causa- 
tion was found, it was considered a crutch with which the subject could 
limp on to better days. It now seems clear that the role of causal analysis, 
although indeed minimal in such areas as theoretical physics in which
exhaustive and effective mathematical laws are available, is indispensable 
both in the application of advanced sciences and, independently, in a 
formulation of the knowledge of the less theoretical sciences. Moreover, 
cogent reasons exist for supposing that there are certain sciences, among
them large parts of the social sciences including parts of the educational
field, where no expectation whatsoever of eventual development of
abstract theories is appropriate. Hence, there is every reason to expect 
that large and respectable parts of science will continue to employ causal 
claims rather than precise systematized laws. 

Naturally, this leads one’s attention to a more careful analysis of the 
concept of cause. Fundamentally, a cause is a miniature explanation—not 
an incomplete explanation, but a small explanation. Particularly, it must 
not be taken to be the same as a sufficient condition, or a necessary 
condition, or as committing its employer to a belief in determinism. 
Cause is an identifying or selecting or focusing or differentiating notion, 
which operates somewhat as a premise in the analysis of deductive 
arguments. It can be understood only in the context of a particular inquiry, 
where the contrasts that it is used to educe can be understood; from a 
formal point of view, any one of 40 variables may be in the same position 
as far as a particular effect goes, but in the context of a particular inquiry 
one, and only one, of these may properly be called the cause. (It is thus 
a notion from pragmatics, rather than syntactics, to give it a proper place 
in the over-all field of logic.) 

The empirical elements involved in isolating the candidates for a 
causal assertion still raise important problems of experimental design. 
How is a distinction to be made between a causal connection and a 
mere correlation? Brodbeck (1957), following Braithwaite (1953), pro- 
posed that the distinction lies in the answer to the question of whether 
the alleged connection can be deduced from some other law or laws: if 
it can, it is causal; if it cannot, it is a mere correlation. This is too 
simple, unfortunately. The problem still remains of whether the laws 
from which it is deduced are themselves causal laws or merely correla- 
tional laws. A complete answer requires a study of the role of the connec- 
tion in those theories, usually of a very tentative kind, which could be 
said to provide an explanation of them. 

Experimentally, the problem does not require the sophisticated analy- 
sis demanded by the philosopher. Nevertheless, it presents some intriguing 
difficulties. Suppose a certain treatment is applied to the experimental 
group, for example, intensive tutorial assistance, and that a perfectly 
matched control group ultimately shows itself to have attained equivalent 
improvement over a certain interval. This result is normally taken to 
demonstrate an absence of causal efficacy on the part of the experimental 
variable. It does not. It may well be that the experimental variable does 
produce significant improvement, but the described design (despite its 
Utopian assumption of perfectly matched controls) does not prove the 
fact (Hook, 1959). This is a practical experimental result which arises
from the logician’s investigations. Similar practical consequences are
found when one turns to applied medicine and examines the current status
of the placebo studies; it is not realized that a single control study
cannot demonstrate any placebo effect.


Evidence 


A great revolution in social science has been taking place, particularly 
throughout the last decade or two. Many educational researchers are 
inadequately trained either to recognize it or to implement it. It is the 
revolution in the concept of evidence. The problems that are faced in 
experimental design in the social sciences are quite unlike those of the 
physical sciences. Problems of experimental design have had to be solved 
in the actual conduct of social-science research; now their solutions have 
to be formalized more efficiently and taught more efficiently. Looking 
through issues of the Review of Educational Research, one is struck
time and again by the complete failure of the authors to recognize the 
simplest points about scientific evidence in a statistical field. The fact 
that 85 percent of National Merit Scholars come from small families
and that over 70 percent are first-born is quoted as if it means something, 
without figures for the over-all population proportion in small families 
and the over-all population proportion that is first-born. 
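
The point about missing base rates can be illustrated with a minimal sketch. Only the 85 percent figure comes from the text; the population shares below are hypothetical placeholders, chosen solely to show that the same quoted percentage supports opposite readings depending on the base rate.

```python
def representation_ratio(share_in_group: float, share_in_population: float) -> float:
    """Ratio of a trait's share among award winners to its share in the population.

    A ratio near 1.0 means the trait is no more common among winners than
    among everyone else, so the raw percentage carries no evidential weight.
    """
    if not 0.0 < share_in_population <= 1.0:
        raise ValueError("population share must lie in (0, 1]")
    return share_in_group / share_in_population


# 0.85 is the figure quoted in the text; the population shares are
# HYPOTHETICAL values used only to show how the conclusion depends on them.
quoted_share = 0.85
for assumed_population_share in (0.85, 0.60):
    ratio = representation_ratio(quoted_share, assumed_population_share)
    print(f"assumed population share {assumed_population_share:.2f} -> ratio {ratio:.2f}")
```

If the population share were itself 85 percent, the ratio would be 1 and the quoted figure would indicate nothing; only a markedly lower base rate would make it informative.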

The simple fact is this: by minimum acceptable research standards, 95 
percent of the work in the field of psychotherapy that is concerned with 
causal analysis is, by either theoretical or practical standards, invalid 
or trivial. In educational research the situation is no different. So far as 
descriptive work goes, the situation is better; but this is less interesting 
(Hook, 1959). One cannot apply anything one learns from descriptive 
research to the construction of theories or to the improvement of educa- 
tion without having some causal data with which to implement it. There 
is no need for educational researchers to feel inferior because of this 
situation, but they should feel dissatisfied. 

Corresponding to this persistent lack of sensitivity to minimum stand- 
ards of good evidence in a multivariable field, there is the persistent 
failure to face up to the problems arising from the fact that the applica- 
tion of educational theories has morally significant consequences. In 
guidance and counseling, for example, which are no different in this 
respect from research into the education of the gifted or other fields that 
could be cited, two senior editors are found agreeing that “authors make 
many philosophical assumptions both explicit and implicit but usually 
neither examine nor test them” (Wilkins and Perlmutter, 1960). 

From the logician’s point of view, then, gross deficiencies of self- 
awareness in educational research exist, although techniques are avail- 
able for handling most of these difficulties. As long as those in education 
allow their own institutions to put out written and cinematographic 
propaganda which seeks support for higher education by arguing that
the average income of graduates is so much higher than that of non- 
graduates as to more than reimburse them for the cost of higher educa- 
tion within very few years (without adducing any grounds whatsoever 
for supposing that this connection is in fact a causal connection and is 
not, for example, due to the higher income group of the families from 
which college students come)—so long will they fall short of achieving 
maturity for their own subject. This is an excellent example of an argument
which is scientifically unsound and significantly immoral, since it
encourages people to spend money on the basis of a belief which is not
known to be well founded.
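
The objection to the income argument can be sketched as a small simulation. The model is entirely hypothetical: family wealth raises both the chance of attending college and later income, while college attendance itself contributes nothing. All numerical constants are illustrative assumptions, not estimates from any study.

```python
import random


def simulate_income_gap(n: int = 20000, seed: int = 0) -> float:
    """Mean income of graduates minus mean income of non-graduates.

    HYPOTHETICAL model: family wealth raises both the chance of attending
    college and later income; attending college itself adds nothing.
    """
    rng = random.Random(seed)
    grads, nongrads = [], []
    for _ in range(n):
        family_wealth = rng.gauss(0.0, 1.0)
        # Wealthier families send their children to college more often.
        attends = rng.random() < (0.7 if family_wealth > 0 else 0.3)
        # Income depends on family wealth and noise only (no college term).
        income = 5000.0 + 1000.0 * family_wealth + rng.gauss(0.0, 500.0)
        (grads if attends else nongrads).append(income)
    return sum(grads) / len(grads) - sum(nongrads) / len(nongrads)


print(f"graduate minus non-graduate mean income: {simulate_income_gap():.0f}")
```

Under these assumptions graduates show a clearly higher mean income although the causal effect of college is zero by construction, which is why the raw comparison cannot by itself support the causal claim.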


Sundry Issues 


The confusion about what constitutes an adequate definition persists, 
and has continued to be discussed during the last three years (Feigl, 
Scriven, and Maxwell, 1958). As in the case of explanation, important 
advances seem imminent. It has been realized that the significant terms 
of theoretical physics are not amenable to explicit definition, or indeed 
to definition in any precise and condensed way. With this collapse of 
the idol around which most of the theology of operationism and reduction 
sentences was built, there has come a more realistic approach to definition. 
As Mandler and Kessen (1959), in a most encouraging book, have 
stressed, there is only one important standard for good definitions, and 
that is inter-user reliability in their use in a given verbal or empirical 
context. That is, the important procedure in the introduction of a new 
term is provision for adequate training in its use for the reader. 

Typically, such training can be provided by giving many examples 
and some loose rules to serve as guidelines for the term’s use. But the 
word loose here must not be misunderstood. A good definition, that is, 
a good explanation of the meaning of a term, gives extremely high 
reliability in its use. Whenever this can be done by explicit simple 
definitions, then it should; with the introduction of new terms this is 
usually possible. But it should not be dismaying to discover that some 
theoretical concepts, new and old, have acquired too great a burden of 
meaning for any explicit definition to encompass. In those cases it must 
not be supposed that the use of a single example (implicit definition) or 
a rough analogy will be adequate. If the introduction of the new term 
is to be justified, rather than the use of a concatenation of old ones, then 
it must be done properly, and this is a lengthy business. 

To the logician it is clear that in educational research, as in the social 
sciences generally, there is still a pathetic tendency to identify the use of 
a jargon with the possession of a science. Terms such as consonance and 
dissonance in social psychology, model, meaningful, intellective, norma- 
tive, methods, scale, role, motivation, cross-cultural, and action research 
are still used (in the special senses which are relevant to educational 
research) in sloppy, unilluminating, and irresponsible ways. It could
almost be said that, outside of statistics, terms which have been intro- 
duced specifically for educational research have done more to confuse 
than to clarify. That such a cynical generalization should have validity 
ought to make those concerned think three times before introducing new 
terms or new senses of old terms. 

Another area where logical analysis is appropriate is discussion of 
objectivity, prejudice, bias, and similar concepts (Gardiner, 1959). 
There is still a pervasive tendency to suppose that the existence of a 
causal explanation for everybody’s beliefs means that there is not a 
rationally superior justification for some of those beliefs. This is the old 
fallacy of the sociology of knowledge, and its ghost should have been 
long since laid (Hampshire, 1959; Hook, 1958). 

Discussion of brainwashing, subliminal perception, and motivation 
research in advertising psychology and psychopathology has important 
consequences for the thoughtful student of education. What distinguishes 
brainwashing from education? What is indoctrination? What is propa- 
ganda? To what extent are educators in fact supporting this kind of 
influencing procedure in their school system with ritual observance of 
allegiance, emphasis on peer-group attitudes as a criterion for social 
action, and the like? Analytical thinking on this kind of subject is still 
badly needed (Kinkead, 1959). 

Finally, careful investigation of the possibility and success of separate 
training in courses in logic, scientific method, critical thinking, and 
investigation of the extent to which such training transfers or generalizes 
to other fields is needed. Somehow it must be ensured that at a much 
earlier stage in their development, students become self-consciously aware 
of the process of education and its presuppositions and justifications, so 
that they will eventually be in a position to improve it in the many ways it 
stands in sore need of improvement. 





General References 


Many of the topics discussed here, and certain others of interest to 
the researcher (for example, logic of discovery), are discussed in com- 
pendious books that appeared during the last three-year period. Refer- 
ence to these will give an interested worker a picture of the present 
range of relevant thought in the philosophy of science (Gibson, 1960; 
Hanson, 1958; Klibansky, 1958; Nagel, 1960; Popper, 1959). 
CHAPTER III


Research Methods: Experimental Design and Analysis 
RAYMOND O. COLLIER, JR. and DONALD L. MEYER


Following the pattern set by Stanley (1957), this chapter omits almost
all the references which have been covered by Harman (1958), Grant 
(1959), and Kogan (1960). Writings which have relevance to or are 
potentially useful for educational experimentation from either a long- 
range or a short-range point of view are, in general, noticed. In certain 
areas of experimental design and analysis, only a representative number 
of the many papers which actually appeared have been considered. 


Design and Analysis of Experiments 


Most of the work relative to the design and analysis of experimental 
results during the last three years has added to and extended standard 
designs and analyses. 


Randomized Blocks, Latin Squares, and Split-Plots 


The frequently employed randomized-block design was considered by 
Sampford and Taylor (1959) for the experimental condition where it is 
known only that a particular subject’s response is greater or less than 
some value. Treatment effects and bias were estimated, and modified 
T-tests were derived for testing treatment differences. 

Mandel (1959) described a method valid for analysis of the Latin- 
square design under the presence of row-column interaction when the 
interaction can be represented by a simple multiplicative constant. Under 
certain experimental circumstances this technique might act to counter 
some of the behavioral scientists’ past objections to the use of Latin- 
square designs. The modified Latin square was analyzed by means of 
randomization theory by Rojas and White (1957). For this rarely used 
design they obtained the expected mean squares under randomization 
and found the F-ratio for treatments to be biased, but with a relatively 
small magnitude. 

The simple split-plot design was studied by Curnow (1957) as used 
both in a randomized block and in a Graeco-Latin square. For the special 
case of two split-plot units with possibly unequal error variances, he 
obtained a test of equality of, and confidence bands for estimating the 
ratio of, the two split-plot error variances. 

A publication which has not received the attention it deserves is Wilk 
and Kempthorne’s (1956) report on the derived-model and randomiza- 
tion theory. They provided randomization analysis of many standard 
designs and included expected mean squares (needed in the specification
of proper error terms) for various effects in many models under general 
schemes of sampling and randomization of treatment to experimental unit. 
Their summary of the role of randomization in experimentation should 
be of use and interest to many researchers who desire an intensive 
treatment of this problem. 


Designs in Which Treatments Are Applied to Subjects in Sequence 


An approach to repeated-measurements experiments, in which more 
than one treatment is applied to experimental subjects over a period of 
time, was given by Geisser (1959). His analysis took into account the 
presence of dependencies among observations on the same subject and 
provided estimation of treatment effects and a test of significance, a 
multivariate T²-test, of treatment effects. Related problems were discussed
by Freeman (1957), whose paper dealt with the specification of designs 
useful in situations where experimental units previously treated in an 
experiment are employed in a new experiment. He supplied analyses of 
variance and covariance and gave variances of treatment differences. 

The involved task of constructing designs balanced for order effects 
in repeated-measurements experiments was examined by Bradley (1958). 
If an even number of treatments are being compared, it is possible to 
construct a Latin square in which each treatment is preceded by a different 
treatment in every row (and column if desired). These configurations have 
been found useful in counterbalancing immediate sequential or other 
order effects, and Bradley gave simple construction procedures for these 
designs. 
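
A minimal sketch of one such construction follows. It uses the familiar cyclic scheme for an even number of treatments (first row 0, 1, n-1, 2, n-2, ..., each later row shifted by one); this is offered as an illustration and is not claimed to be the exact procedure given by Bradley. The function names are hypothetical.

```python
def balanced_latin_square(n: int) -> list:
    """Cyclic Latin square for an even number n of treatments, labeled 0..n-1."""
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction requires an even n >= 2")
    # First row zigzags between low and high labels: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each later row adds 1 (mod n) to every entry of the row above.
    return [[(t + shift) % n for t in first] for shift in range(n)]


def immediate_predecessor_counts(square: list) -> dict:
    """Count how often each ordered pair (before, after) is adjacent in a row."""
    counts = {}
    for row in square:
        for before, after in zip(row, row[1:]):
            counts[(before, after)] = counts.get((before, after), 0) + 1
    return counts


for row in balanced_latin_square(4):
    print(row)
```

In the resulting square every treatment is immediately preceded exactly once by each of the other treatments, which is the counterbalancing property described above; `immediate_predecessor_counts` can be used to check it for any even n.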

Sampford (1957) offered methods for building and analyzing designs 
in which estimation and test of direct effects of treatment in the period 
applied, and also the residual effects on treatments in the following 
periods, were desired. For treatments applied in sequence to the same 
subjects, Sampford provided designs in which the residual effect of any 
treatment appears the same number of times either with each direct 
effect including itself or with each of the direct effects not including itself. 


Factorial Experiments 


The last few years have seen increased prevalence of experiments in- 
volving factorial arrangement of treatments. Experimenters include many 
factors, believing they are then studying a process similar to that which 
they meet in nature. This, of course, is a step away from the one-factor- 
at-a-time experiment. Many experiments involving several factors were 
reported, for this seems to be a very useful design in educational 
research. 

A comprehensive listing of plans for full and fractional replication of 
investigations with up to 256 treatment combinations was provided by 
Mitton and Morgan (1959). Dykstra (1959) considered two-level factorials
and fractional factorials in which a subset of the treatment combinations
was replicated in order to secure an unbiased estimate of error.
Birnbaum (1959) provided methods for judging which of certain con- 
trasts may be different from zero in factorial experiments performed with- 
out replication. Schwarz (1960) discussed a class of factorial designs for 
N observations which are distributed over the cells so that the cell
frequencies are unequal but the resulting normal equations are explicitly 
solvable. 

The problem of estimating effects, for example, linear and quadratic 
effects, in a single-factor experiment when the levels of the factor are 
unequally spaced was dealt with by Robson (1959), who presented a 
simple method of constructing orthogonal polynomials, with numerical 
examples. McHugh (1958) discussed Hartley’s procedure for testing several 
effects in the analysis of variance for factorial experiments. This technique 
involves adjusting the significance levels for testing each effect so that an 
over-all error rate is not exceeded for all effects. The problems associated 
with the practice of using a single mean square in testing many effects 
are considered, and one solution is offered. 
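
The construction of orthogonal polynomial contrasts for unequally spaced levels, mentioned above, can be sketched with ordinary Gram-Schmidt orthogonalization. This is a generic illustration under that assumption and is not claimed to reproduce Robson's own method; the factor levels are hypothetical.

```python
def orthogonal_polynomial_contrasts(levels, degree):
    """Contrast vectors for polynomial trends at arbitrarily spaced levels.

    Gram-Schmidt is applied to the columns 1, x, x^2, ... evaluated at the
    levels; the constant column is dropped from the result.
    """
    if degree >= len(levels):
        raise ValueError("degree must be smaller than the number of levels")
    mean = sum(levels) / len(levels)
    centered = [x - mean for x in levels]  # centering improves numerical stability
    basis = []
    for power in range(degree + 1):
        vec = [x ** power for x in centered]
        # Remove the components along every lower-order vector.
        for prev in basis:
            coef = sum(v * p for v, p in zip(vec, prev)) / sum(p * p for p in prev)
            vec = [v - coef * p for v, p in zip(vec, prev)]
        basis.append(vec)
    return basis[1:]


# HYPOTHETICAL, unequally spaced factor levels (e.g., minutes of practice).
linear, quadratic = orthogonal_polynomial_contrasts([1, 2, 4, 8], degree=2)
print(linear)
print(quadratic)
```

Each returned vector sums to zero and is orthogonal to the others, so the linear and quadratic effects can be estimated separately even though the levels are not equally spaced. For equally spaced levels the vectors are proportional to the familiar tabled coefficients.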


Designs Useful in Investigating the Nature 
and Maximum Values of a Pattern of Response 


Of several articles dealing with response surface methodology, the most 
definitive was that by Box and Draper (1959). Considering the mini- 
mization of the integrated mean square error over some experimental 
region as a basis for the selection of response surface design, they 
showed that this criterion leads to two separate sets of terms: one involv- 
ing the variance of the estimated response; the other, the specification 
bias. The optimum design which minimizes both variance and bias was 
found to be nearly the same as that given by minimizing bias alone. 

Bose and Draper (1959) offered a special group transformation which 
leads to infinite classes of second-order rotatable designs in both two and 
three dimensions. Draper (1960) extended this transformation to second- 
order rotatable designs in four or more dimensions. Gardiner, Grandage, 
and Hader (1959) proposed several designs for exploring response sur- 
faces when the assumed model is of third order. 

DeBaun (1959) observed that the usual second-order designs require 
at least five levels of each factor and considered three-factor designs
which require only three levels of each factor. None of these techniques 
and designs has been used to any extent in educational experimentation 
although there are areas such as learning research where their application 
might be efficient. 

Nonlinear models were studied by Box and Lucas (1959). Their prob- 
lem consisted of assuming a response to be a nonlinear function of either 
parameters or variables, and then searching for a pattern of administering
treatment combinations so as to allow a precise estimation of the param- 
eters. Applying their approach to educational and psychological experi- 
mentation, for example, one might look for a schedule of time points so 
that observations taken at these points would permit efficient estimation 
of learning or growth parameters. 
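
The idea of choosing time points for efficient estimation can be sketched for the simplest possible case. The one-parameter learning curve f(t) = 1 - exp(-theta * t), the prior guess for theta, and the grid search are all assumptions made for illustration; the sketch is not claimed to reproduce any example of Box and Lucas. With a single parameter, the most informative observation time is the one that maximizes the squared derivative of the response with respect to theta.

```python
import math


def sensitivity_squared(t: float, theta: float) -> float:
    """Squared derivative of f(t) = 1 - exp(-theta * t) with respect to theta."""
    return (t * math.exp(-theta * t)) ** 2


def best_single_time_point(theta_guess: float, t_max: float, steps: int = 10000) -> float:
    """Grid search for the observation time most informative about theta."""
    grid = [t_max * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda t: sensitivity_squared(t, theta_guess))


# theta_guess = 0.5 is a HYPOTHETICAL prior guess at the learning rate.
print(best_single_time_point(theta_guess=0.5, t_max=10.0))
```

For this model the maximum lies at t = 1/theta, so the search returns a value close to 2 for the guess used here. The dependence of the best time point on the unknown parameter is characteristic of nonlinear models: a preliminary guess at the parameter is needed before the schedule can be chosen.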


Miscellaneous Articles 


A few articles not properly classified under the foregoing headings are 
reviewed in this section. 

An important article by Chernoff (1959) contained a thorough, theoretical
discussion of sequential experimentation. Since experimenters
seldom perform experiments as single, isolated investigations, but rather 
as links in a chain of research and theory, there should be ready applica- 
tions of sequential designs, and this particular paper is welcomed as a 
forerunner of applications to come. 

Bechhofer (1960) developed a multiplicative model for factorial ex- 
periments where the variance of a variable is under study, giving analyses 
for testing hypotheses concerning variances. One can think of many
educational and psychological investigations where the variance itself is an
important variable. Designs which adjust for time trends or changes in a 
process over time and in which both qualitative and quantitative variables 
may be studied were presented by Hill (1960). The complexities associated 
with missing or mixed-up observations were discussed by Kramer and 
Glass (1960) for the Latin-square design and by Biggers (1959) for 
several designs. 

Bradley and Schumann (1957) and Schumann and Bradley (1957, 
1959) gave the underlying theory for comparing the sensitivities of two 
similar experiments, using noncentral variance ratios in both Model I and 
Model II of the analysis of variance, and discussed its application. The 
comparison of experiments with different scales of measurement was also 
discussed. 

Inasmuch as most articles on incomplete block designs dealt with 
methods of constructing classes of partially balanced and balanced designs, 
they were considered to be of limited general interest and are not in- 
cluded here. 


The Analysis of Variance 


Scheffé’s work on the analysis of variance has already been mentioned. 
A less extensive survey of analysis-of-variance models, their construction, 
and their differential aspects was that of Plackett (1960), who gave 
particular attention to the finite models of Kempthorne, Wilk, Tukey, and 
Cornfield. 
Using matrix methods, Roy and Gnanadesikan (1959a, b) presented
a unified general treatment for Model I and Model II in the analysis of vari- 
ance both for the univariate and the multivariate case. Another comprehen- 
sive paper with a nontechnical approach to problems in the analysis of 
variance was Green and Tukey’s (1960). Bankier (1960) proposed an 
operational method for obtaining the expected mean squares in the analysis 
of variance and the variances of estimates of variance components for an 
r-way classification. 

The components-of-variance model was also considered by Bankier and 
Walpole (1957) for two-way crossed and nested classifications with pro- 
portional subclass frequencies. Useful expected values for various sums of 
squares were obtained for a variety of models. Likewise, Searle (1958) 
studied the two-way classification components-of-variance setup. For the 
unequal frequencies case, he derived the sampling variance of estimates 
of the components. 

In some analysis-of-variance settings, the error components in the 
underlying model must be assumed to be correlated and to have unequal 
variances. This problem was treated by several writers, many of whom 
were motivated by repeated-measurement studies. Two such papers are 
informative. 

Extending Box’s original results on the two-way to the r-way classifica- 
tion, Bhat (1959) obtained distributions for various sums of squares 
under the assumption that the error components were correlated with 
heterogeneous variances. His results are of particular interest to re- 
searchers for whom the assumption of independent errors is untenable, 
for example, in the profile analysis problem and situations where the 
subject is measured under several conditions. 

With different assumptions, Rao (1959) derived estimation and test 
procedures for various parameters in general linear models. He investi- 
gated models in which the observations are assumed to have the multi- 
variate normal distribution with an arbitrary unknown variance and 
correlation structure estimable from the data. Rao’s results, although 
difficult to apply, make it unnecessary to assume a patterned structure of 
correlations as is often done, and the repeated-measurement problem is 
thus given more generality. 

With more emphasis on application, Matern (1957) offered a method 
for obtaining degrees of freedom through a linear combination of the 
number of squared terms in each component of the sum of squares. 

With reference to tests of hypotheses in the analysis of variance, Sut- 
cliffe (1958) concluded that random errors of measurement decrease the 
sensitivity of the F-test of difference among means. Again on tests of hy- 
potheses, Collier (1959) showed that the test of a main effect, e.g., rows, for 
a two-way classification with interaction in a reparameterized model is 
equivalent to testing the hypothesis that the average of cell parameters 
for a row is constant for all rows. 

Among investigations of the effects of assumptions in the analysis of 
variance, Hack’s (1958) paper was of considerable interest. He obtained
empirical randomization distributions for row and column F-ratios in a 
completely randomized two-way layout with one observation per cell. Hack 
considered two configurations—one showing little deviation from normality 
and a second more markedly deviating from normality. He obtained 100 
random permutations of the observations and compared the upper and 
lower 5-percent and 10-percent empirical F-points with those of Snedecor’s 
F-distribution. Although agreement was close for the near-normal case, the 
theoretical F-points for the second case would be underestimates of the 
true permutation probabilities. Johnson (1958) followed with a theo- 
retical discussion of Hack’s investigation. 
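
The kind of empirical randomization distribution described above can be sketched in a few lines. The following is a minimal illustration only, not a reconstruction of Hack's computations: the function names, the default of 100 permutations, and the way the upper empirical point is read off are all assumptions made for the example.

```python
import random


def two_way_f_ratios(table):
    """Row and column F-ratios for an r x c layout with one observation per cell."""
    r, c = len(table), len(table[0])
    grand = sum(map(sum, table)) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    # With one observation per cell the residual is the row-by-column interaction.
    ms_resid = (ss_total - ss_rows - ss_cols) / ((r - 1) * (c - 1))
    return (ss_rows / (r - 1)) / ms_resid, (ss_cols / (c - 1)) / ms_resid


def randomization_f_points(table, n_perm=100, upper=0.05, seed=0):
    """Empirical upper points of the row and column F-ratios under random
    reallocation of the observations to cells (hypothetical helper)."""
    rng = random.Random(seed)
    r, c = len(table), len(table[0])
    flat = [x for row in table for x in row]
    row_fs, col_fs = [], []
    for _ in range(n_perm):
        rng.shuffle(flat)
        shuffled = [flat[i * c:(i + 1) * c] for i in range(r)]
        f_row, f_col = two_way_f_ratios(shuffled)
        row_fs.append(f_row)
        col_fs.append(f_col)
    row_fs.sort()
    col_fs.sort()
    # Value exceeded by roughly the fraction `upper` of the permutations.
    k = min(n_perm - 1, round((1 - upper) * n_perm))
    return row_fs[k], col_fs[k]
```

The empirical points returned here would then be set beside the tabled points of the F-distribution for (r - 1) and (r - 1)(c - 1), or (c - 1) and (r - 1)(c - 1), degrees of freedom.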

Srivastava (1959) studied the effects of non-normality on the non-null 
distribution of the F-statistic in an equal-frequency, one-way classification. 
Within the limitation of his specification of non-normality, he found that 
skewness had little effect on the power of the analysis-of-variance F-test, 
but that extreme deviations in kurtosis affected the power function in a 
variable fashion, particularly with small samples. 


The Analysis of Covariance 


Under certain conditions, the analysis of covariance is a highly useful 
and effective tool in the interpretation of experimental results. The con- 
ditions under which it can be efficiently used were considered at length 
in the September 1957 Biometrics. Most of the papers included were 
reviewed by Grant (1959) and are not considered here. Unreviewed papers 
by Zelen (1957), Federer (1957), and Wilkinson (1957) were con- 
cerned with covariance in incomplete block designs, in unbalanced 
classifications, and as related to the incomplete-data problem. 

Experimenters who have decried the practice of matching groups on one 
or more variables as a substitute for obtaining random-treatment groups 
will be buoyed by the results of Finney (1957). He concluded that: (a) the 
objective matching of groups is practically impossible, (b) the arrange- 
ment resulting from a matching procedure hardly qualifies as a random 
arrangement of units, and (c) the practice leads to a biased F-ratio. The 
use of a covariance analysis with matching techniques is able to effect 
little gain in precision. 


Nonparametric Techniques in Experimental Design 


Although many articles dealt with nonparametric techniques, we con- 
sider here only those directly relevant to experimental design. 

A theoretical article by Walsh (1959) suggested a class of nonpara- 
metric procedures for testing the statistical identity of treatments in 
randomized blocks. Siegel and Tukey (1960) offered a nonparametric test 
for testing the null hypothesis that two samples come from the same 
population against the alternative that the samples are from populations 
differing only in variability. 
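
The ranking scheme usually associated with the Siegel-Tukey test can be sketched as follows. This is a minimal illustration assuming no tied observations; the function names are hypothetical, and the handling of an odd pooled sample size (where the middle observation is customarily set aside) is omitted.

```python
def siegel_tukey_ranks(n):
    """Ranks for positions 0..n-1 of the sorted pooled sample (no ties).

    Rank 1 goes to the smallest value, ranks 2 and 3 to the two largest,
    ranks 4 and 5 to the next two smallest, and so on, so that extreme
    observations receive low ranks and central ones high ranks.
    """
    ranks = [0] * n
    lo, hi = 0, n - 1
    ranks[lo] = 1
    lo += 1
    rank = 2
    from_top = True
    while lo <= hi:
        for _ in range(2):
            if lo > hi:
                break
            if from_top:
                ranks[hi] = rank
                hi -= 1
            else:
                ranks[lo] = rank
                lo += 1
            rank += 1
        from_top = not from_top
    return ranks


def siegel_tukey_rank_sum(x, y):
    """Sum of the Siegel-Tukey ranks received by the first sample."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = siegel_tukey_ranks(len(pooled))
    return sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
```

A sample that is more variable than the other collects the low ranks, so an unusually small or large rank sum points to a difference in dispersion; the sum is referred to the ordinary rank-sum tables.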

A rank-sum test for comparing each of several treatments against a 
control in an experiment was proposed by Steel (1959), who also later 
(1960) advanced a rank-sum test for comparing all pairs of treatments in 
a one-way classification with equal numbers of observations in each treat- 
ment. Van Elteren and Noether (1959) obtained the asymptotic efficiency 
of the test statistic underlying Durbin’s rank analysis for the incomplete 
block design as compared to the analogous F-test in normal theory. 

Using theory developed by Roy and Mitra, Hoyt, Krishnaiah, and Tor- 
rance (1959) gave the analysis for several hypotheses of interest in a 
four-way contingency table. Numerical examples were given and exten- 
sions to higher-order tables were indicated. 


Some Current Thought in Experimental Design 


Four basic works present the most incisive and progressive current 
views on experimental design and perhaps point the direction of future 
endeavors. 

On the one hand, Gridgeman’s (1959) re-examination of the problems 
surrounding Fisher’s tea-tasting lady will assure experimenters that the 
“old” controversies in interpreting experimental results have not been 
resolved. On the other hand, Kiefer, in two challenging papers (1958, 1959) 
departing from traditional views, compared the optimality properties of 
classes of designs, such as the Latin-square or balanced incomplete block 
designs chosen randomly or nonrandomly from these classes. For the 
designs considered, he concluded that—depending on the objectives of 
the experimenter, e.g., estimation or hypothesis testing—the symmetrical 
classical, randomized designs may be nonoptimal, and that nonclassical, 
nonsymmetrical designs may be optimal. The argument rests, of course, 
on definitions of optimality (Kiefer presents several), and there seems to 
be little agreement on this point among either statisticians or experi- 
menters. The development of Kiefer’s contributions should be of interest 
to many researchers. 

A long-awaited presentation of a popular technique was offered by 
Scheffé’s (1959) discussion of the theoretical and practical aspects of the 
analysis of variance, which gave extensive exposition of the various models 
and analyses used in the interpretation of experimental and survey results. 
He included finite and infinite models based on fixed, random, or mixed 
components and independent and dependent components. Scheffé’s whole 
approach was one of rigorous exposition of a method of analysis which 
has had great utility. 

These works reflect interest in examining the philosophy and structure 
of the design and analysis of experiments. As healthy as such interest is, 
it will cause the experimenter in time to alter his approach to experi- 
mentation. 


December 1960 EXPERIMENTAL DESIGN AND ANALYSIS 


Concluding Remarks 


The foregoing discussion shows an abundance of articles and other 
writings on experimental design. Among books dealing with experimental 
design, the analysis of variance, and related topics were those of Chew 
(1958), Cox (1958), Finney (1960), Freund, Livermore, and Miller 
(1960), Haggard (1958), Li (1959), Maxwell (1958), Ray (1960), 
Scheffé (1959), and Williams (1959). Several of these books were 
reviewed in various journals, and references to reviews are given in the 
bibliography. 

As is always the case, many articles could not be included. A list of the 
omitted articles may be obtained from the authors. 



CHAPTER IV 


Research Tools: Statistical Methods 


WILLIAM B. MICHAEL and STEVE HUNKA* 


THAT THE RATE of growth of statistical methodology is a positively 
accelerated phenomenon would seem to be true when one compares the amount 
of published material during the last three years to the quantity appearing 
during each of the several preceding three-year intervals. More than 1200 
references were located, and about 450 have been included, whereas 216 
references were noted in the corresponding chapter of the December 1957 
Review by Michael, Kaiser, and Clark. 

Their pattern of organization and coverage is followed here. Statistical 
methods especially applicable to test construction, analysis, and evalua- 
tion are deferred for a future issue on educational and psychological 
testing. The period covered is essentially that between July 1957 and July 
1960. 

The chapter is organized as follows: after a review of recent books, 
attention is devoted to (a) general developments in statistical theory with 
particular stress on contributions to statistical inferences involving para- 
metric procedures; (b) recent advances in the theory and application of 
chi-square and contingency tables; (c) published research concerning the 
binomial, Poisson, and multinomial distributions; (d) innovations and 
modifications in nonparametric theory and techniques; (e) developments 
in regression and correlation theory, including curve fitting; and (f) 
methodological advances in factor analysis. 

The reader is urged to consult other chapters—especially the one on 
experimental design—to complete his coverage of other statistical areas 
such as analysis of variance and data-processing techniques. The excellent 
critiques of research in statistical methodology by Harman (1958), Grant 
(1959), and Kogan (1960) in the Annual Review of Psychology should 
not be overlooked. 


Books 


Scores of books on statistical methodology and experimental design 
appeared. Kendall and Buckland’s (1957) new dictionary of statistical 
terms, prepared under the auspices of UNESCO, is an indispensable 
reference aid. For most researchers in education and psychology, books 
written by behavioral scientists will be the most helpful. Among note- 
worthy introductory texts are those by Blommers and Lindquist (1960), 
Diamond (1959), Downie and Heath (1959), Ferguson (1959), Johnson 
and Jackson (1959), Mack (1960), Senders (1958), Snedecor (1960), 
and Walker and Lev (1958). Three lucid statistically oriented books 
in experimental design are Edwards’s (1960) revised text and two new 
books, by Maxwell (1958) and Ray (1960). Not to be overlooked is a 
readily comprehended book in nonparametric and shortcut statistics by 
Tate and Clelland (1957). 

*The writers are indebted to Cherry Ann Clark for bibliographic assistance in the early stages of 
preparation of the manuscript. The junior author is primarily responsible for the portion on factor 
analysis, and the senior author for the other sections. 

At a somewhat more advanced level, but intended for the behavioral 
scientist, are two general books in quantitative methods: that by Lewis 
(1960) and a collection of papers from a 1959 Stanford University sym- 
posium edited by Arrow, Karlin, and Suppes (1959). At the same level 
of sophistication are books in multivariate and correlational analysis by 
DuBois (1957), Ezekiel and Fox (1959), and Haggard (1958). 

At a high level of mathematical sophistication are: Kendall and Stuart’s 
(1958) revision of their classical text in advanced statistical theory; a 
work on the testing of statistical hypotheses by Lehmann (1959b); two 
contributions to experimental design by Cochran and Cox (1957) and Cox 
(1958a); three books on multivariate and correlational analysis by Theo- 
dore Anderson (1958), Kendall (1957), and Roy (1957); and three 
philosophically flavored works on probability and inference by Feller 
(1957), Hogben (1957), and Jeffreys (1957). 

Detailed consideration of the theory of measurement and scaling, as 
well as of individual and group decision processes and information theory, 
is beyond the scope of this chapter. Nevertheless, attention should be called 
to a number of significant contributions. In addition to Churchman and 
Ratoosh’s (1959) provocative book concerned with the definition and 
theory of measurement, three outstanding volumes on scaling appeared: 
Torgerson’s (1958) comprehensive treatment of method; the proceedings 
of the 1958 Princeton University conference on theory and applications 
of psychological scaling edited by Gulliksen and Messick (1960); and 
Thurstone’s (1959) important collection of 27 papers on the measurement 
of values. 

In the area of decision making, there appeared, beyond the relatively 
elementary presentation by Siegel and Fouraker (1960), four volumes: 
the papers from the 1959 Purdue University symposium on information 
and decision processes edited by Machol (1960), the presentation of Luce 
(1959) concerning individual choice behavior within the framework of 
psychophysics and utility theory, and contributions to decision theory 
and game theory by Chernoff and Moses (1959), and Luce and Raiffa 
(1957). 

In information and communication theory the most elementary and 
readable volume was Attneave’s (1959). More advanced books were pro- 
duced by Kullback (1959), who described information theory within a 
statistical framework, and Middleton (1960), who wrote on statistical 
communication theory. 

Particularly deserving of note is Kemeny, Snell, and Thompson’s (1957) 
short, elementary book that introduces the reader who has studied only 
high-school mathematics to the ideas of modern mathematics, including set 
theory, probability, vector and matrix algebra, and elementary game 
theory. Diligent reading of this little volume will give educational re- 
searchers a grasp of recent advances in statistical thinking. Other books 
that serve a similar purpose but require greater background in mathe- 
matics are, in order of difficulty, Finkbeiner’s (1960), Hohn’s (1958), 
Murdoch’s (1957), Parker and Eaves’s (1960), and Thrall and Tornheim’s 
(1957). 

Additional References: Alder and Roessler (1960); Ahmavaara (1957); 
Ahmavaara and Markkanen (1958); Bailey (1959); Bartlett (1955); 
Bharucha-Reid (1960); Burington and May (1959); Bush and Estes 
(1959); Davidson, Suppes, and Siegel (1957); Edwards (1958); L. I. 
Epstein (1958); Fraser (1958); Garrett (1958); Goldberg (1958); 
Goldfarb (1960); Grenander (1959); Gumbel (1958); Halmos (1960); 
Hendricks (1956); Hoel (1960); Hogg and Craig (1959); Johnson and 
Rao (1959); Levens (1959); McCarthy (1957); Moore (1958); 
Quenouille (1958); Resnikoff and Leiberman (1957); Riordan (1958); 
Scheffé (1959); Simon (1957); Sloan (1960); Steel and Torrie (1960); 
Stephan and McCarthy (1958); Von Mises and Geiringer (1957); 
Williams (1959a, b). 


General Developments Primarily in Parametric Statistics 


Emphasis on parametric theory and methods was greater than that 
given to nonparametric methods. It would appear that the pendulum may 
have swung in the direction of a continuation of the development and 
extension of statistical methodology along more traditional lines. 


Statistical Inference in General 


A number of specific papers in related disciplines concerned with 
statistical analysis and inference were of interest to the behavioral scientist. 
Two important contributions were made, for example, in the biological 
sciences: Chassan (1959) discussed the development of clinical statistical 
systems for psychiatry, and Emmens (1960) described the role of statistical 
analysis in physiological research. 

General papers concerned with statistical inference were numerous. In 
an expository article based on Fisher’s well-known tea-tasting problem, 
Gridgeman (1959) defended the rationale of a hypothesized population of 
identical experiments and argued that consideration should be given to 
non-null cases in theory testing if one is to construct a satisfactory 
probabilistic model for sensory-sorting tests and to realize efficiency in 
various experimental designs. Tukey’s (1960) descriptive paper concern- 
ing paths along which experimental statistics should develop was directional 
in emphasis. As a participant in a symposium on scientific method, Taylor 
(1958) argued against use of statistical-significance tests to verify 
experimental or research hypotheses that have not been logically evaluated, since 
absurd hypotheses may be supported. That errors of the first kind may be 
perpetuated in the psychological literature is evident from a survey by 
Sterling (1959) of 362 research studies appearing in four journals. In 
294 of those studies, significance tests were used with the result that more 
than 97 percent of the null hypotheses were rejected in the absence of any 
reports of replication of previously published investigations. 

In a historical discourse Welch (1958) reviewed Gosset’s work and 
its impact on statistical thinking and concluded that Student's theory 
is an improvement in large-sample theory only if the populations sampled 
approximate Gaussian form. However, in a systematic study involving 
sampling from normal, J-shaped, and rectangular distributions and embodying 
violation of the assumption of equal variances, Boneau (1960) demonstrated, 
by and large, a minimal effect on the resulting distribution of t. 
Likewise, Srivastava (1958) showed that for practical purposes the power 
of the t-test is not markedly affected even when samples are selected from 
substantially non-normal populations. That interest in Student’s theory 
has not diminished is also evident from an important extension of tables 
of percentage points of Student’s t distribution by Federighi (1959), 
White’s (1957) t-test for a serial correlation coefficient, Moore’s (1957) 
two-sample t-test based on the range for pairs of samples between 2 and 20 
in size, a tabulation by Pachares (1959) of the upper-10-percent points of the 
Studentized range, and the study by Pillai and Tienzo (1959) of the 
distribution of the Studentized extreme deviate from the sample mean along 
with determination of percentage points. 
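
A sampling study of the general kind attributed to Boneau can be imitated on a small scale. The sketch below is an assumption-laden illustration rather than a replication: it varies only the shape of the parent population (not the variances or sample sizes), fixes both samples at 10 observations so that the tabled two-sided 5-percent point of t for 18 degrees of freedom (2.101) applies, and uses hypothetical function names.

```python
import math
import random
import statistics


def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / nx + 1 / ny))


def rejection_rate(draw, n=10, reps=2000, crit=2.101, seed=1):
    """Proportion of true null hypotheses rejected at the nominal 5-percent level.

    `draw` takes a random.Random instance and returns one observation.
    `crit` is the two-sided 5-percent point of t for 2n - 2 = 18 degrees of
    freedom, so it must be changed if n is changed.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [draw(rng) for _ in range(n)]
        y = [draw(rng) for _ in range(n)]
        if abs(pooled_t(x, y)) > crit:
            rejections += 1
    return rejections / reps


# Parent populations: normal, rectangular, and a J-shaped (exponential) form.
PARENTS = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "rectangular": lambda rng: rng.random(),
    "J-shaped": lambda rng: rng.expovariate(1.0),
}
```

Calling `rejection_rate(PARENTS["rectangular"])`, for example, gives the observed proportion of rejections to compare with the nominal 0.05.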

Four other general papers on statistical inference were particularly 
noteworthy. Interested in ways in which current statistical theories can 
indicate the extent of uncertainty of an inference, Buehler (1959) developed 
some validity criteria and attempted to demonstrate the consequences of 
weakening classical assumptions concerning prior distributions. In his 
expository paper, Cox (1958c) considered problems of inferential deci- 
sions, the sample space of observations, interval estimation, significance 
tests, and the importance of assumptions. Particularly concerned with the 
amount of power that can be achieved in significance tests, Lehmann (1958) 
attempted, in his highly theoretical development, to show how significance 
levels could be chosen relative to alternative hypotheses of interest. 

Good (1958), in a provocative paper, derided the notion that significance 
must always be precise. He discussed a number of controversial problems, 
and proposed a rule-of-thumb procedure involving use of a harmonic mean 
or weighted harmonic mean of the tail-area probabilities associated with 
various significance tests on the same evidence or data. Consideration was 
given to judgments about the weights to be employed for combining the 
results of several different types of tests of statistical significance that are 
applied to the same set of data (referred to as tests in parallel), in contrast 
to the more familiar independent tests of significance computed on different 
sets of data (described as tests in series). 


Interval Estimation 


Work on statistical estimation, especially interval estimation, was vast. 
In an expository and highly theoretical paper, Steinhaus (1957) considered 
in detail the problem of estimation. In an equally abstract paper, 
Wallace (1959b) described sufficient conditions for the realization of certain 
properties in a confidence procedure, and found that, if the confidence 
procedure at level α furnishes with respect to all samples the posterior 
probability α relative to some prior probability distribution on the 
parameter space, then there exist no subsets of the sample space for 
which “the conditional confidence is uniformly less (or greater) than α.” 
Moreover, if a sequence of prior distributions is employed, a 
result of wider application is obtained, although it is slightly weaker. 
Beale (1960) gave extensive consideration to confidence regions in non- 
linear estimation. 

To distinguish between fiducial and confidence intervals, Stein (1959) 
selected an example in which—despite the apparent existence of a large 
fiducial probability—the chance of the true parameter’s being contained 
within that interval is exceedingly slight. In the instance of significance 
tests, Anscombe (1957) was able to show that the sampling rule must be 
considered in order to apply correctly R. A. Fisher’s fiducial argument. 
Supplementing discussion in his recent book (1956), Fisher (1959) com- 
pared his fiducial argument with that of Neyman and Pearson concerning 
confidence intervals, and proposed three requirements for making correct 
statements regarding mathematical probability. In a highly readable article, 
Chandler (1957) pointedly differentiated between the concepts of con- 
fidence and confidence level on the one hand and significance level on the 
other, the broad distinction being that of interval estimation and that of 
testing of hypotheses. 

Among papers dealing with more specific problems pertaining to in- 
terval estimation were two by Dunn (1958, 1959a) in which she presented 
methods for constructing sets of simultaneous confidence intervals to 
include means of variables conforming to a multivariate normal 
distribution. For a (correlated or uncorrelated) bivariate normal population, 
Roy and Potthoff (1958) obtained confidence bounds on the vector ana- 
logues of the ratio of variances and the ratio of means. Other contribu- 
tions were those of Tate and Klett (1959), who found optimal confidence 
intervals for the variance of a normal distribution; Ray (1957), who 
employed a modified sequential-estimation procedure for the determination 
of confidence intervals for the mean of a normal population when the 
variance is unknown; and Banerjee (1959), who developed expressions 
for the lower bound of confidence coefficients when samples are taken 
from a non-normal population. 

The use of confidence intervals in conjunction with problems in sampling 
constituted an important feature of two papers. To handle situations in 
which a randomly selected sample may actually turn out to be undesirable 
in a certain respect, Jones (1958) described a procedure that involves 
the calculation of confidence limits, although the disadvantage exists of 
having to specify in advance the subclass of inadmissible samples. In 
determining what the size of a sample should be relative to a designated 
width of a confidence interval that contains the parameter at a specified 
probability level, Graybill (1958) proposed a two-step sampling procedure. 


Sampling Procedures 


Although absent in statistical literature of the behavioral sciences, many 
articles on sampling appeared in journals on mathematical statistics. For 
the situation in stratified sampling in which 200 or fewer numbers are 
to be placed in 10 or fewer groups, W. D. Fisher (1958) devised a practical 
procedure based on the minimization of variance within groups that served 
to maximize homogeneity. Likewise, Dalenius and Hodges (1959) furnished 
means of minimizing variance in finite sampling. 
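
The grouping problem described above, placing a set of numbers into a fixed number of groups so that the within-group sum of squares is as small as possible, lends itself to a short dynamic-programming sketch. The code below is an illustration under stated assumptions and is not offered as the procedure of the paper cited: groups are taken to be contiguous runs of the sorted values, the function name is hypothetical, and the number of groups must not exceed the number of values.

```python
def optimal_grouping(values, k):
    """Split the sorted values into k contiguous groups minimizing the total
    within-group sum of squared deviations. Returns (groups, total_ss)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must lie between 1 and the number of values")

    # Prefix sums give the within-group sum of squares of any run in O(1).
    s1, s2 = [0.0], [0.0]
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def within_ss(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[g][j]: smallest total SS when the first j values form g groups.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                candidate = best[g - 1][i] + within_ss(i, j)
                if candidate < best[g][j]:
                    best[g][j] = candidate
                    cut[g][j] = i

    # Walk the stored cut points backward to recover the groups.
    groups, j = [], n
    for g in range(k, 0, -1):
        i = cut[g][j]
        groups.append(xs[i:j])
        j = i
    return groups[::-1], best[k][n]
```

For 200 values and 10 groups the triple loop performs on the order of a few hundred thousand comparisons, which is trivial by present standards.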

In sampling from both finite and infinite populations, Aggarwal (1959) 
discussed Bayes and minimax procedures and considered the allocation 
of total samples with respect to familiar loss and risk functions. Sampling 
with replacement from finite populations was the subject of a paper by Raj 
and Khamis (1958), who extended their results to multistage designs. 
Basu (1958) discussed sampling procedures both with and without replace- 
ment, whereas Stevens (1958) limited his consideration to sampling with- 
out replacement. In the use of random numbers for the selection of a par- 
ticular sample, Jones (1959) described ways for determining how many 
samples will be usable. 

For the circumstance in which observations arise from noticeably 
different populations, Walsh (1959b) defined and described use of a 
generalized percentage point that not only guards against the acceptance 
of an erroneous assumption of the presence of a random sample, but also 
entails only slight penalty when a random sample actually occurs. Gupta 
and Sobel (1958) developed a sampling procedure for selection of a 
subset of observations by which all populations exceeding a certain standard 
are included at a specified probability level. Using a single-sample proce- 
dure, Dunnett (1960) described a minimax approach for determining 
how large a sample must be in order to associate it with the largest of 
the means of several normal populations of known equal variances and 
covariances. 

Methods of double sampling as well as multistage or sequential sampling 
were considered in several papers with particular emphasis on develop- 
ment of, or modifications in, estimators. After examining the classical out- 
comes in theory of regression and estimation from double sampling and 
extending them to finite populations, Tikkiwal (1960) relaxed various 
assumptions and determined the resulting influences on the traditional 
minimum-variance linear unbiased estimators. By means of modifying 
familiar ratio-type estimators used in sample surveys involving a large 
number of strata, Goodman and Hartley (1958) developed an unbiased 
ratio-type estimator with an exact formula for its variance and compared 
the precision of their approach to that of other estimators. Mickey (1959) 
also furnished unbiased ratio and regression estimators in the instance 
of random sampling without replacement from a finite population. For 
multistage samples Kish and Hess (1959) described complications arising 
from the variance of ratio estimators involving two variables. Nanjamma, 
Murthy, and Sethi (1959) discussed some sampling systems that provide 
for unbiased ratio estimators, and Murthy (1957) considered both ordered 
and unordered estimates in sampling without replacement. 
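
The contrast between the familiar ratio estimator and an unbiased ratio-type estimator can be made concrete. The sketch below uses the Hartley-Ross form of the unbiased estimator for simple random sampling without replacement; whether this coincides in detail with the estimators developed in the papers cited above is an assumption, and all function names are hypothetical. Unbiasedness is checked exactly by averaging over every possible sample from a small artificial population.

```python
from itertools import combinations


def classical_ratio_mean(xs, ys, pop_mean_x):
    """Familiar (generally biased) ratio estimate of the population mean of y."""
    return (sum(ys) / sum(xs)) * pop_mean_x


def hartley_ross_mean(xs, ys, pop_mean_x, pop_size):
    """Unbiased ratio-type estimate of the population mean of y.

    Requires at least two sampled units and nonzero x values.
    """
    n = len(xs)
    r_bar = sum(y / x for x, y in zip(xs, ys)) / n  # mean of the unit ratios
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    correction = (n * (pop_size - 1)) / (pop_size * (n - 1)) * (y_bar - r_bar * x_bar)
    return r_bar * pop_mean_x + correction


def average_over_all_samples(estimator, pop_x, pop_y, n):
    """Exact expectation of an estimator under sampling without replacement."""
    samples = list(combinations(range(len(pop_x)), n))
    total = 0.0
    for idx in samples:
        total += estimator([pop_x[i] for i in idx], [pop_y[i] for i in idx])
    return total / len(samples)
```

Averaging `hartley_ross_mean` over all samples reproduces the population mean of y, whereas the same average for `classical_ratio_mean` will in general differ from it by the bias of the ratio estimate.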

Other noteworthy papers on sampling included those of DeGroot and 
Nadler (1958), who studied the behavior of Wald’s sequential probability- 
ratio test when an erroneous value of variance was taken relative to two 
applications; of Wormleighton (1960), who furnished a helpful generali- 
zation of Stein’s (1945) two-sample procedure; of Maurice (1957), who 
applied Wald’s minimax procedure to develop a sequential method of 
sampling relative to making a decision between two normal populations 
from information given by two sample means; and of Hack (1958), who, 
in his empirical study of the distribution of F-ratios in samples chosen 
from non-normal populations, showed that considerable departure from 
normality may be tolerated. 


Point Estimation 


Aside from those concerned with sampling techniques, several other 
theoretically oriented articles appeared that were concerned with estima- 
tion. Bahadur (1957) considered unbiased estimates of uniformly mini- 
mum variance; Aitchison and Silvey (1958) took up maximum-likelihood 
estimation of parameters when they were subject to certain restraints; Roy 
and Chakravarti (1960) discussed ways of estimating the mean of a finite 
population; Tate (1959) studied unbiased estimation of functions of loca- 
tion and scale parameters for distributions of the exponential type; 
and Graybill and Deal (1959) showed how a set of random variables could 
be used to form a weighted combination of unbiased estimators that would 
in turn be a uniformly improved unbiased estimator. 


Estimation in Censored Samples 


For the singly censored sample in which measures above a cutting point 
in an ordered series are omitted or missing, Saw (1959) furnished un- 
biased estimates of the mean and variance of a normal population. Earlier, 
Saw (1958) derived moments of sample moments of censored samples 
selected from a normal population. In the instance of incomplete data 
associated with both censoring and truncation, Hartley (1958) presented 
a generalized method of maximum-likelihood estimation embodying 
simplified computational procedures. Making use of a single auxiliary function 
that is conveniently tabulated, A. C. Cohen (1959) described simplified 
estimators of the mean and variance of a normal distribution from samples 
that are singly censored or truncated. 

The problem of censoring was central to three papers involving use of 
order statistics (such as percentiles or linear combinations thereof) which 
are employed in estimation of parameters. Continuing earlier work pre- 
viously cited in the Review by Michael, Kaiser, and Clark (1957), Sarhan 
and Greenberg (1958) furnished tables in the instance of samples between 
11 and 15 in size for the estimation of location and scale parameters 
through use of order statistics from both singly and doubly censored 
samples. Subsequently Sarhan and Greenberg (1959) furnished best linear 
estimates of location and scale parameters for the rectangular population 
under conditions of Type II censoring, and included graphs to illustrate 
the influence of censoring on relative efficiency of estimates. In a related 
paper Watterson (1959) extended methods of linear estimation to various 
sorts of censored samples taken from a multivariate normal population— 
methods which corresponded to those proposed by Sarhan and Greenberg 
for the univariate case. Finally, through use of order statistics, Dixon 
(1960) offered simplified methods of estimation from censored normal 
samples. 


Estimation with Order Statistics Without Censoring 


Dixon (1957) had earlier furnished several simplified estimates of the 
mean and standard deviation of a normal population, the efficiencies of 
which were compared to the sample mean and standard deviation and also 
to the best linear unbiased estimators. Other papers concerned with order 
statistics were those of Bose and Gupta (1959), who obtained moments 
of order statistics for samples from a normal population; of Harter (1959), 
who made use of “sample quasi-ranges” to estimate the standard deviation 
of a population; and of Masuyama (1957), who employed the sample range 
to estimate the standard deviation of the variable of any type of popula- 
tion. 
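
The simplest estimate of this family divides the sample range by its expected value in standard-deviation units. The sketch below assumes a normal population and uses the commonly tabled constants for samples of 2 to 10; it is neither the quasi-range method nor the distribution-free approach mentioned above, and the names are hypothetical.

```python
# Expected range of n independent standard-normal observations
# (the constants commonly tabled for use with small samples).
EXPECTED_RANGE = {
    2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
    7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078,
}


def sigma_from_range(sample):
    """Estimate the standard deviation of a normal population from the range."""
    n = len(sample)
    if n not in EXPECTED_RANGE:
        raise ValueError("constants are given here only for n = 2 to 10")
    return (max(sample) - min(sample)) / EXPECTED_RANGE[n]
```

For larger samples the range loses efficiency quickly, which is one reason quasi-ranges and other linear combinations of order statistics are of interest.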


Hypothesis Testing 


Although there is a workable distinction between estimation and hypothe- 
sis testing in the consideration of problems of statistical inference, the fact 
that the concepts are not independent is well illustrated in a theoretical 
paper by Aitchison and Silvey (1960), who considered maximum-likelihood 
estimation procedures in conjunction with associated tests of statistical 
significance. In another highly abstract paper, Bulmer (1957) distinguished 
between the acceptability and the confirmation of a statistical hypothesis 
depending, respectively, upon whether the hypothesis is supported by a 
significance test, or comparable procedure, or is rejected. With 
confirmation defined in terms of a distance function in the hypothesis space that 
indicates the extent of the discrepancy of any hypothesis from the null 
hypothesis, all admissible hypotheses can be tested and then classified as 
either acceptable or unacceptable. If in terms of distance none of the ac- 
ceptable hypotheses close to the null hypothesis turns out to be “near” to 
the null hypothesis, it is declared to be confirmed; otherwise the experi- 
ment may be regarded as inconclusive, with the null hypothesis uncon- 
firmed. Applications of the rationale are presented along with prior 
discussion of the acceptability of likelihood criteria. That the standard 
likelihood-ratio test of the general linear hypothesis possesses a broad class 
of optimum properties was shown by Lehmann (1959a) to result from the 
fact that it is uniformly the most powerful invariant. 

Testing the hypothesis of homogeneity or heterogeneity was the subject 
of several papers. In an important article concerning the testing of 
homogeneity of alternatives ordered in value, Bartholomew (1959a) 
stressed that the appropriate test of a hypothesis rests on the careful 
specification of alternative hypotheses. He proposed that in place of stating 
the more general hypothesis of inequality among alternative means, for 
example, one should, in the presence of available information, specify the 
rank order of the means—a circumstance for which he furnished a solution 
involving an appropriate one-tail test. Subsequently Bartholomew (1959b) 
extended his results to more significance levels and gave percentage points 
for as many as five ordered alternatives. In an empirical investigation of 
a simple test of the homogeneity for populations composed of normal 
distributions, Baker (1958) indicated that his test would detect nonhomo- 
geneity when samples are as small as 50. 

Related papers were those of Maurice (1958), who investigated the prob- 
lem of ranking the means of two normal populations when the variances 
are unknown; Zinger and St-Pierre (1958), who furnished a means of 
selecting which mean is highest or lowest in three normal populations with 
known variances; Anscombe and Guttman (1960), who considered rules 
appropriate in the rejection of outlying values in experimental work when 
the population variance is known; and Haldane (1959), who analyzed 
heterogeneity from the standpoint of estimating the mean and variance
of a frequency when it varies throughout a series of samples. 

Among other papers concerned with the testing of hypotheses, Gnanade- 
sikan (1959) proposed a test of the hypothesis of equality of more than 
two variances in more than two univariate normal populations. He ex- 
tended his results to the multivariate case in which the equality of more 
than two dispersion matrices is tested against certain alternatives. To 
determine the equality of variances of two normal populations, Ramachan- 
dran (1958b) developed and illustrated a completely unbiased two-sided 
test with the property of monotonicity and furnished appropriate signifi- 
cance tables. For use when the upper bound of the standard deviation is 
known for samples from a normal population with unknown means and 
unknown variances, Colton (1960) suggested a test procedure the power
of which under certain conditions exceeds that of the familiar t-test or
F-test. 

For the small-sample situation in which symmetric truncation of the 
normal distribution has occurred and in which a one-sided test of the 
hypothesis for the mean of the distribution is made, Aggarwal and Guttman 
(1959) showed that the loss in power decreases very quickly as a function 
both of the discrepancy between the alternative value of the mean and that 
hypothesized and of the distance of the hypothesized mean from the points 
of truncation. Assuming that the parameters of a population are known, 
Clark (1957) considered how one-tail and two-tail truncation of a normal 
population should be carried out relative to prescribed probabilities in 
order to meet specified requirements for values of sample means. 

Other papers concerned with tests of hypotheses were those of Blyth 
(1958), who considered possible definitions of relative efficiencies when 
the same hypothesis is tested through use of two sequences of tests and 
who also calculated relative efficiencies of the Student test and sign test 
against normal alternatives; of Khatri (1960), who proposed two statis- 
tics for testing the hypothesis of equality of ranges in k rectangular popu- 
lations and included a tabulation of five-percent points; of G. S. James 
(1959), who put forward a new “exact” test for weighted means that takes 
into account information furnished by the variances; of Schumann and 
Bradley (1957), who compared the sensitivities of similar experiments em- 
bodying different scales of measurement through tests of hypotheses on the 
noncentrality parameter involving two noncentral variance ratios; and of 
Dempster (1958, 1960), who developed a two-sample significance test as 
an alternative approach to Hotelling’s T² statistic.


One-Tail Versus Two-Tail Tests 


Though less prominent than during other three-year periods, interest 
in the one-tail versus two-tail controversy persisted. Challenging the three 
criteria proposed by Kimmel (1957) for determining when one-tail tests 
can be used, Grant (1959) believed that two of the criteria invited con-
fusion and endangered “the integrity of the rejection level,” and Goldfried
(1959) somewhat discounted the importance of two of Kimmel’s criteria 
for determining the theoretical predictability and psychological meaning 
when “unexpected” results occur with a one-tail test. He implied the need 
in this situation for flexibility in the experimenter’s decision of whether or 
not to use one-tail tests depending on difficulties encountered. Shaklee 
(1957) disputed Kimmel’s statement that there is a doubled probability of 
commission of a Type I error under a two-tail hypothesis for corresponding 
significance levels. More important was Kogan’s (1960) point that the
region of rejection need not be either equally distributed in each tail or
concentrated in one tail of the sampling distribution, a point illustrated
in terms of Ramachandran’s (1958b) finding that in the customary application
of the F-test for equality of two independent variances involving equal-tail
areas, the increase in the power of the test is not monotonic as the ratio
of the variances of the two populations departs from equality. After point-
ing out a frequently committed logical error of making a directional sta- 
tistical decision after the null hypothesis has been rejected in a nondirec- 
tional two-sided test, Kaiser (1960c) outlined what he believed to be an 
appropriate treatment based on Wald’s statistical decision function. 
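
Kogan's point about the allocation of the rejection region can be illustrated with a minimal sketch. It assumes, purely for illustration, a test statistic that is standard normal under the null hypothesis; the function name `rejection_bounds` and the parameter `lower_share` are hypothetical and do not come from any of the papers reviewed.

```python
from statistics import NormalDist


def rejection_bounds(alpha, lower_share):
    """Critical values when a fraction `lower_share` of alpha lies in the lower tail.

    lower_share = 0.5 gives the usual equal-tail test and 0.0 an upper one-tail
    test; any value in between gives an unequally divided rejection region.
    Returns (lower, upper); a bound is infinite when its tail receives no area.
    """
    z = NormalDist()  # assumption: statistic is standard normal under H0
    lower_area = alpha * lower_share
    upper_area = alpha - lower_area
    lower = z.inv_cdf(lower_area) if lower_area > 0 else float("-inf")
    upper = z.inv_cdf(1 - upper_area) if upper_area > 0 else float("inf")
    return lower, upper


# Equal tails, one tail, and an unequal division of the same 5-percent level.
for share in (0.5, 0.0, 0.2):
    print(share, rejection_bounds(0.05, share))
```

The three calls use the same significance level throughout; only the division of the rejection region between the tails changes.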


Approximation Methods 


In a general paper with important implications for and relevance to 
problems of estimation, Burkholder (1959) systematically and critically 
examined conditions that allow a best approximation to one distribution 
function by another distribution of a specified type. Somewhat more spe-
cifically, Wallace (1959a) considered formulas that would convert upper-
tail values of Student’s t distribution, as well as chi-square variates, to
normal deviates. To allow for the situation in which a slow-moving trend 
in the mean of a population serves to introduce bias in customary measures 
of dispersion such as sample range, sample-mean deviation, and sample 
variance, Sathe and Kamat (1957) offered four new approximate measures 
of dispersion derived from successive differences. By means of extending 
the standardized percentage points of the Pearson Type-IV curve, Merring- 
ton and Pearson (1958) succeeded in effecting a close approximation to 
the distribution of noncentral t.




Useful Tables and Nomographs 


Tabular preparations and graphical aids constituted an important part 
of many of the papers reviewed, and five articles were specifically con- 
cerned with such means of presentation. In order to avoid calculations 
involved in using existing rectangularly distributed observations, Quen- 
ouille (1959) made available tables of random observations derived from 
some standard distributions. The customary need for estimation of param- 
eters in bio-assay problems, a part of the procedure referred to as normit 
analysis, led Berkson (1957) to develop tables for use in estimating the 
normal distribution function by methods of normit analysis. Subsequently 
Berkson (1960) provided nomographs in order to fit the logistic function 
by the method of maximum likelihood. 

For the normally distributed random variable with unknown mean and 
known standard deviation, Barraclough and Page (1959) developed tables 
to assist in the calculation of Wald’s sequential test for the mean of a normal 
distribution. Particularly helpful in rating techniques is Moonan’s (1959) 
tabulation of the frequencies of the normal distribution relative to selected 
numbers of class intervals and various sample sizes. 
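
The kind of calculation that tables for Wald's sequential test are meant to shorten can be sketched as follows. This is a minimal illustration of the sequential probability-ratio test for a normal mean with known standard deviation, using Wald's approximate boundaries; the function name, the argument names, and the default error rates are assumptions made for the example, and nothing here reproduces the Barraclough and Page tables.

```python
import math


def sprt_normal_mean(xs, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's sequential test of H0: mean = mu0 against H1: mean = mu1.

    Observations are processed in order; sampling stops as soon as the
    cumulative log-likelihood ratio crosses one of Wald's approximate bounds.
    Returns (decision, number of observations used).
    """
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0
    for n, x in enumerate(xs, start=1):
        # Log-likelihood-ratio increment for one normal observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", len(xs)


print(sprt_normal_mean([1.0] * 10, mu0=0.0, mu1=1.0, sigma=1.0))
```

With observations sitting exactly at the alternative mean, each one adds 0.5 to the log-likelihood ratio, so the upper bound of log 19 is first crossed at the sixth observation.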

Additional References: Cox (1957); Dwight (1957); B. Epstein (1960);
Eisenberg and Gale (1959); Fisher and Cornish (1960); Gjeddebaek
(1959); Gupta (1960); Hogg (1960); Hoyt and Krishnaiah (1958);
N. L. Johnson (1958); Katz and Powell (1957); Rider (1957, 1960a, b);
Ruben (1960); Tukey (1957); Turner (1960).


Chi-Square, Contingency Tables, and Related Topics 


Emphasis on the chi-square statistic and problems posed by use of con- 
tingency tables was less than during the previous three-year period. 


Contingency Tables 


Several noteworthy statistical contributions concerning the use of con-
tingency tables appeared. Extending their well-known paper noticed in the
December 1957 issue of the Review, Goodman and Kruskal (1959) fur- 
ther considered in detail the use of measures of association for cross- 
classification and included a comprehensive bibliography of 150 references. 
Of particular help to the behavioral scientist is Mayo’s (1959) definitive 
paper concerning ways in which the contingency table can be strengthened 
as a statistical method. Specifically, Mayo presented recommendations of 
how chi-square can be employed for small samples in the instance of both 
attributive and qualitative data, described various approaches to the deter- 
mination of indices of relationship, discussed at length alternative hypothe- 
ses that could be proposed and empirically verified when a significant 
chi-square value is obtained, enumerated and commented on three ap- 
proaches to the assessment of higher-order interaction, and finally sug- 
gested numerous computational procedures and graphic aids to assist the 
researcher. In a related paper, Hoyt, Krishnaiah, and Torrance (1959) 
proposed ways of analyzing complex contingency data. 

Concerned with the equivocal results arising out of indices of association
in contingency tables, Blalock (1958) furnished conditional probabilistic 
interpretations of coefficients of mean-square contingency and suggested 
another coefficient as yielding unambiguous results. Making use of work 
in information theory, Kupperman (1959) proposed a simple coefficient 
that could be used to test the null hypothesis of independence between row 
and column classifications and gave a numerical illustration. In the instance 
of a two-by-two-by-two contingency table, which seems to have been neg- 
lected in educational research, Snedecor (1958), in reply to an inquiry, 
evaluated possible outcomes in the application of alternative chi-square 
techniques that have been proposed by a number of investigators. Also 
interested in higher-order contingency tables, Kastenbaum and Lamphiear 
(1959) succeeded in generalizing Bartlett’s method to allow a test of the 
null hypothesis of the absence of any three-factor interaction in a three- 
way contingency table, although the computational efforts in estimation 
of the parameters are almost prohibitive. 

Two practical problems in the area of consumer preferences and in the
area of accidents and absenteeism, respectively, prompted papers by R. L.
Anderson (1959) and Nass (1959). For the situation in which each of n 
consumers is asked to rank each of three varieties in a one-two-three order 
of preference and to record the judgments in a contingency table, Anderson 
proposed a method of analysis that takes into account the lack of random-
ness in repeated samplings. Nass described a chi-square test for handling small
expectations where there is a large number of small samples of accidents 
or absences for one worker during two or more subperiods of a total 
period of observation. The test permits inferences of whether the absences 
or accidents of individual workers can be considered a random sample 
of the population furnished by marginal totals of the contingency table 
irrespective of whether these totals correspond to the actual lengths of sub- 
periods or some other assumption concerning the distribution of absences 
or accidents. 


Goodness-of-Fit Applications 


Considerable attention was devoted to use of chi-square as an indicator 
of the degree of goodness-of-fit. After detailed consideration of the pre- 
vious work of Chernoff and Lehmann, summarized in the December 1957 
issue of this Review, Watson (1957) made use of their findings in con- 
junction with the normal distribution, allowed the class intervals to contain, 
relative to the sample mean and variance, constant probabilities so as to vary 
with sampling, and concluded that the chi-square statistic (defined as
$\sum (f_o - f_e)^2 / f_e$, whereas chi-square stands for the well-known distribution or
for a variable with this distribution) is distributed in the Chernoff and 
Lehmann form. Watson then proceeded to suggest that, in practice, at least 
10 class intervals be employed in order that tabular points for the well- 
known chi-square distribution can be used with an error less than 1 percent. 
Moreover, discussion was directed to the asymptotic distribution found 
when fitting is to a normal distribution. 

Extending his earlier work on chi-square goodness-of-fit tests to that of 
fitting an observed distribution to hypothetical continuous distributions, 
Watson (1958) gave detailed consideration to the number and size of class 
intervals that should be chosen, and he concluded, contrary to accepted 
practice, that many, rather than few, class intervals should be employed 
so that probabilities of inclusion in each interval are approximately equal,
a procedure rarely followed by behavioral scientists. More recently
Watson (1959) presented some results obtained in the application of chi- 
square goodness-of-fit tests. 
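
The practice of choosing class intervals with equal fitted probabilities can be sketched as below. This is only an illustration under stated assumptions: a normal distribution is fitted from the sample mean and standard deviation, the interval boundaries are the quantiles of that fitted distribution, and the function names are hypothetical. It computes the familiar sum of squared differences between observed and expected frequencies divided by expected frequencies, and it does not attempt to reproduce the limiting distributions discussed by Watson.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, stdev


def chi_square_statistic(observed, expected):
    """Sum over classes of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def equiprobable_normal_fit(data, k):
    """Group data into k classes that are equally probable under a fitted normal.

    Boundaries depend on the sample mean and standard deviation, so they vary
    from sample to sample. Returns (observed, expected, statistic).
    """
    fitted = NormalDist(mean(data), stdev(data))
    cuts = [fitted.inv_cdf(j / k) for j in range(1, k)]  # k - 1 boundaries
    observed = [0] * k
    for x in data:
        observed[bisect_right(cuts, x)] += 1
    expected = [len(data) / k] * k
    return observed, expected, chi_square_statistic(observed, expected)


# Illustrative data only: forty equally spaced scores, ten classes.
scores = [float(i) for i in range(1, 41)]
print(equiprobable_normal_fit(scores, 10))
```

Because the parameters are estimated from the ungrouped sample, referring the resulting value to ordinary chi-square tables is an approximation, which is the issue the papers above address.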

Three other papers pertaining to goodness-of-fit tests were those of 
Darwin (1958), who employed the method of characteristic functions to 
effect a correction to the familiar approximations of the chi-square good- 
ness-of-fit criteria for the multinomial distribution; of Chapman (1958), 
who made a comparative analysis and evaluation of many one-sided good-
ness-of-fit tests; and of Lancaster (1958), who, in addition to considering
the relationship between contingency and correlation, proposed a new test 
for the goodness-of-fit of the bivariate normal distribution. 


Miscellaneous Papers 


Discussing the Studentized smallest chi-square, Ramachandran (1958a) 
proposed a method for ascertaining whether among a set of variances the 
smallest variance is significantly less than a designated variance. To in- 
vestigate the possibility of the presence of a significant component of a 
certain type of departure from a hypothesized proportionality when the 
over-all chi-square reveals nonsignificant heterogeneity, Bodmer (1959) 
proposed an approximate test for the existence of an extreme frequency in 
a set of binomial frequencies. 

Additional References: Mitra (1958); matinee (1959); Stanley 
(1957). 


The Binomial, Poisson, and Multinomial Distributions 


Approximately 30 papers concerned with the binomial, Poisson, and 
multinomial distributions are of interest. On the binomial distribution, the 
contribution of greatest practical value was that of MacKinnon (1959), 
who furnished a concise table containing 12 probability levels of the 
symmetric binomial cumulative distribution for samples ranging in size 
to 1000, and also included useful approximation methods. Bahadur (1960) 
considered several approximations to the distribution function of the 
binomial. 
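
For readers without access to such a table, entries of the symmetric binomial cumulative distribution can be computed exactly. The sketch below is not MacKinnon's table or his approximation methods; it simply sums binomial coefficients for the case in which the probability of success is one-half, and the function name is hypothetical.

```python
from math import comb


def symmetric_binomial_cdf(k, n):
    """P(X <= k) for X distributed as Binomial(n, 1/2), computed exactly."""
    if k < 0:
        return 0.0
    k = min(k, n)
    # Integer sum of coefficients, divided once by 2^n to limit rounding error.
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


# Probability of 2 or fewer successes in 10 trials: (1 + 10 + 45) / 1024.
print(symmetric_binomial_cdf(2, 10))
```

The same function serves sample sizes up to 1000, the range covered by the table, because Python integers do not overflow.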

Problems of sampling and estimation were given much emphasis. In 
order to show the experimenter how large a difference and what confidence 
coefficient to choose for two binomial populations, Somerville (1957) de- 
scribed a procedure based on the minimax principle and furnished a for- 
mula for determining sample size as a function of cost of sampling and 
the cost of making an erroneous decision. For a family of binomial distri- 
butions, DeGroot (1959) developed criteria in order to achieve a workable 
sequential sampling procedure involving an optimal unbiased estimator of 
specified values for the parameter. Also concerned with sequential estima- 
tion of a binomial parameter, Armitage (1958) described a method to 
obtain confidence limits for a binomial probability and calculated unbiased 
estimates of the parameter relative to three sequential designs. Vagholkar 
and Wetherill (1960) proposed a binomial sequential probability-ratio 
test which they considered to be the “most economical.” 

Interval estimation for the parameter in the binomial model was the 
subject of a paper by Clunies-Ross (1958). In order to determine narrower 
confidence intervals than those given by classical procedures for the param-
eter of the binomial and Poisson distributions, Stevens (1957) examined
several different methods. In the treatment of data embodying binomial 
responses for which the logistic curve is often used as an alternative to the 
integrated normal curve, Silverstone (1957) demonstrated that the method 
embodying maximum likelihood, but not the method of “minimum logit 
chi-square,” furnishes sufficient estimators for the logistic curve and thus 
strongly recommended the former approach in preference to the latter. 

To investigate a practical problem in the combining of accident fre- 
quencies, Tanner (1958) used a binomial model. Using a Poisson approxi- 
mation to the binomial, Buehler (1957) developed, for small probabilities 
of failure and for samples of moderate size, an approximate method for 
estimating confidence intervals involving the product of two binomial 
parameters and furnished tables of intervals relative to the 90-percent and 
95-percent confidence levels. For the situation in which an erroneous ob-
servation or report yields c defective samples when actually the number is
c + 1, A. C. Cohen (1960d) employed the method of maximum likelihood
to estimate the binomial parameter. 

For N binomial samples of the same size, the relative frequencies of 
which have been ordered in value, Chassan (1960) developed an expression 
with respect to a significance level α for the upper bound of the probability
that the particular observed ordering of values under the null hypothesis 
could arise by chance. In the comparison of several rates or proportions, 
such as relative frequency of lung cancer in smokers and nonsmokers, 
Sheps (1959) examined several models and suggested a general method 
of estimating parameters. For the difference between binomial probabili- 
ties, MacKay (1959) offered asymptotically efficient tests based on the 
sums of observations. 

The estimation of parameters in a truncated, a conditional, and a modi- 
fied Poisson distribution, respectively, was the subject of three papers 
by A. C. Cohen (1960a, b, c). Related studies were those of Sprott (1958), 
who applied the method of maximum likelihood in estimation procedures 
concerning the Poisson binomial distribution; of Irwin (1959), who con- 
sidered the estimation of the mean of a Poisson distribution from a sample 
for which the zero class is absent; of Tate and Goen (1958), who for 
truncation at the left of this distribution furnished minimum variance un- 
biased estimates; of Crow and Gardner (1959), who in the estimation of 
a Poisson variable presented tables of two-sided confidence intervals rela- 
tive to several confidence coefficients and all values of the variable from 
0 to 300; and of Chakravarti and Rao (1959), who provided tables for 
several small-sample significance tests for the Poisson distribution as well 
as for two-by-three contingency tables. 
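
The problem of a Poisson sample with the zero class absent can be illustrated with the standard maximum likelihood equation, in which the observed mean is set equal to the mean of the zero-truncated distribution, λ / (1 − e^(−λ)). The sketch below solves that equation by bisection. It is offered only as an illustration of the problem; it is not claimed to be the estimator proposed in any of the papers cited above, and the function name is hypothetical.

```python
import math


def truncated_poisson_mle(xbar, iterations=200):
    """Solve xbar = lam / (1 - exp(-lam)) for lam by bisection.

    xbar is the mean of the observed (nonzero) counts and must exceed 1,
    since every retained observation is at least 1.
    """
    if xbar <= 1:
        raise ValueError("mean of a zero-truncated sample must exceed 1")
    lo, hi = 1e-12, xbar  # the root lies below xbar because the truncated mean exceeds lam
    for _ in range(iterations):
        mid = (lo + hi) / 2
        truncated_mean = mid / -math.expm1(-mid)
        if truncated_mean < xbar:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Round trip: start from lam = 2, form the truncated mean, recover lam.
lam = 2.0
print(truncated_poisson_mle(lam / (1 - math.exp(-lam))))
```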

Use of the multinomial model in decision making and classification was 
treated in three papers, those of Bechhofer, Elmaghraby, and Morse (1959), 
Kesten and Morse (1959), and Wesler (1959). In the first, attention was 
given to the selection of the multinomial event with the highest probability. 
In two closely related papers, Rao (1957, 1958) considered maximum
likelihood estimation for the multinomial distribution. Johnson (1960)
described properties and applications of an approximation to the multi- 
nomial distribution. 

Additional References: Crow (1958); Johnson (1959); Mendenhall and
Lehmann (1960); Ramasubban (1958, 1959). 


Developments in Nonparametric Statistics 


Although attention given to nonparametric statistics was somewhat less
than during the previous three-year period, a substantial number of papers
appeared.


General and Theoretical Papers 


Arguing against the compulsive use of nonparametric methods in place 
of classical parametric methods and for retention of the latter, despite the 
failure of the scale of measurement to be interval in form, were Cox 
(1958c), Gaito (1959), and Kogan (1960). Believing that nonparametric 
techniques should serve as exploratory or screening devices, Gaito urged 
that they be given limited use; Kogan (1960) cited the more extensive util- 
ity of parametric methods, and also the strikingly “rapid potency of the cen- 
tral limit theorem.” On the other hand, in a detailed and systematic paper 
concerning a distinction between approximate and exact methods in non- 
parametric statistics, Sawrey (1958) strongly implied the importance of 
properties of the scale of measurement in the decision of the appropriate 
use of an exact or approximate nonparametric test, as did Senders (1958) 
much more explicitly in her textbook. 

Among the most important theoretical papers were those of Savage 
(1957), who for various trend hypotheses considered detailed relationships 
among the probabilities of rank order with implications for admissibility 
of rank-order tests; of Rao, Savage, and Sobel (1960), who took up the 
two-sample censored case; and of Savage (1960), who furnished rules 
for the computation of rank-order probabilities with particular reference 
to the determination of the efficiency of Wilcoxon’s two-sample test rela- 
tive to the standard-normal test and t-test. 

Making use of order statistics and of only the assumption that continuity 
exists in the marginal distributions, Dunn (1959b) developed estimates 
of joint (bounded) confidence intervals for the medians of a bivariate 
population. For the case of two independent samples, Birnbaum and 
McCarty (1958) described a numerical procedure based on an extension 
of the Mann-Whitney formulation for determining how large a sample 
would have to be to yield a distribution-free one-sided confidence interval 
of given width and specified level. Previously Birnbaum and Klose (1957) 
had given bounds for the variance of the Mann-Whitney statistic. To de-
termine nonparametric tolerance limits, Somerville (1958) furnished use-
ful tables.

Theoretical papers concerned with the concepts of power and efficiency
included that of Chernoff and Savage (1958), who, for two absolutely con- 
tinuous cumulative distributions of two sequences of ordered observations, 
considered properties of asymptotic normality and efficiency of several 
nonparametric tests following a certain form and that of Fraser (1957), 
who, in making use of an invariance principle, derived what he termed to 
be the most powerful tests for ranked data relative to normal alternatives. 
Witting (1960) developed a generalized efficiency measure for nonpara- 
metric tests based on the Pitman approach. 


Significance Tests 


Articles concerned with new nonparametric tests or with modifications 
of existing ones were numerous. Without making an assumption of either 
continuity or independence, Kuang (1960) derived a probabilistic inequal- 
ity that indicates whether two samples can be contained in a certain class 
of distributions. In an ingenious development, Tukey (1959) proposed an 
easy and quick test to determine whether or not two independent samples 
come from the same population. He proposed summing the number of values 
in the first group (designated as the sample with the largest scale value) 
that exceed all values in the second group and then adding to this fre- 
quency the number of values in the second group not reaching the smallest 
value of the first group. For a two-tail test the critical frequencies in the 
total count are 7, 10, and 13 at the 0.05, 0.01, and 0.001 levels. To facili- 
tate the comparison of changes in an experimental group with those of a 
control group, Silverstein (1958) urged the use of existing nonparametric 
tests once the differences between measures (changes) have been ordered. 
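
Tukey's counting procedure as described above lends itself to a short sketch. The critical counts 7, 10, and 13 are taken from the text; the function names are hypothetical, and the treatment of tied extremes (returning a count of zero) is a simplifying assumption of this example rather than part of the description given here.

```python
def tukey_end_count(a, b):
    """Count of nonoverlapping extreme values in two independent samples.

    The sample holding the largest value is taken as the "high" sample. The
    count is the number of its values above every value of the other sample,
    plus the number of values in the other sample below every value of the
    high sample. Returns 0 when one sample holds both extremes or when the
    extremes are tied (an assumption made for this sketch).
    """
    high, low = (a, b) if max(a) > max(b) else (b, a)
    if max(high) == max(low) or min(high) <= min(low):
        return 0
    above = sum(1 for x in high if x > max(low))
    below = sum(1 for y in low if y < min(high))
    return above + below


def two_tail_level(count):
    """Smallest of the levels 0.05, 0.01, 0.001 reached by the count, else None."""
    for critical, level in ((13, 0.001), (10, 0.01), (7, 0.05)):
        if count >= critical:
            return level
    return None


group_1 = [10, 11, 12, 13, 14, 15, 16, 17]
group_2 = [1, 2, 3, 4, 5, 6, 7, 11.5]
count = tukey_end_count(group_1, group_2)
print(count, two_tail_level(count))
```

In this illustrative data set six values of the first group exceed every value of the second, and seven values of the second fall below every value of the first, so the total count is 13.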

In an expository and historical paper, Darling (1957) discussed the 
Kolmogorov-Smirnov and Cramér-Von Mises tests from the standpoint 
of goodness-of-fit and comparison of two samples. Other contributions 
pertaining to the Kolmogorov-Smirnov statistic included papers by Carvalho 
(1959), who presented a new derivation of the distribution of the statistic; 
by David (1958), who developed an adaptation of the test for three sam- 
ples; and by Kiefer (1959), who furnished k-sample analogues for both 
the Kolmogorov-Smirnov and the Cramér-Von Mises tests. 
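
For the comparison of two samples, the Kolmogorov-Smirnov statistic is the largest absolute difference between the two empirical distribution functions. The sketch below computes only that statistic, under the assumption that both samples are nonempty; it supplies no critical values, and the function name is hypothetical.

```python
from bisect import bisect_right


def ks_two_sample_statistic(a, b):
    """Largest absolute gap between the empirical distribution functions of a and b."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # The gap can change only at observed values, so checking those suffices.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)  # proportion of a at or below x
        fb = bisect_right(sb, x) / len(sb)  # proportion of b at or below x
        d = max(d, abs(fa - fb))
    return d


print(ks_two_sample_statistic([1, 3, 5], [2, 4, 6]))
print(ks_two_sample_statistic([1, 2, 3], [4, 5, 6]))
```

Interleaved samples give a small value and completely separated samples give the maximum value of 1.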

Much attention was given to tests for comparing independent samples. 
In an informative article Kruskal (1957) described in historical perspective 
five independent proposals that anticipated Wilcoxon’s unpaired two-sample 
test. For testing the hypothesis that two independent populations are unlike 
only in location, Sukhatme (1958a) studied the asymptotic behavior of the 
Mann-Whitney U statistic. Subsequently, Sukhatme (1958b) furnished a 
new nonparametric test for comparing variances and described a formula 
for its asymptotic relative efficiency. Also noteworthy in the instance of two 
independent samples was Halperin’s (1960) extension of the familiar tests 
of Wilcoxon and of Mann and Whitney for samples that are censored at
identical fixed points; he developed significance tables for sample sizes
less than or equal to eight and for several degrees of censoring. 
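
The Mann-Whitney U statistic mentioned above, together with its large-sample normal approximation, can be sketched as follows. The example assumes uncensored samples and applies no correction for ties to the variance; the function name is hypothetical, and the sketch does not reproduce the extensions by Sukhatme or Halperin.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, z), where U counts pairs with x above y (ties count one-half).

    z uses the null mean n1*n2/2 and variance n1*n2*(n1+n2+1)/12, which
    assumes no ties and is adequate only for moderately large samples.
    """
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mean_u) / sd_u


print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

With samples this small the normal approximation is shown only to make the arithmetic visible; exact tables would be used in practice.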

Resembling a one-way analysis of variance procedure for the comparison 
of several treatments with a control group when the numbers of observations
are all equal is the multiple-comparison rank-sum test proposed by Steel 
(1959a), who discussed both the exact and approximate distribution and 
presented an example as well as a tabulation of critical values. In a later 
paper Steel (1960) proposed and illustrated a multiple-comparison rank- 
sum test that permits the simultaneous comparison of all possible pairs of 
treatments in a one-way classification when the numbers of observations are
equal for all treatments. In a similar vein, Wallace (1959c) furnished an
improved beta approximation to the Kruskal-Wallis test for a one-way
analysis of variance by ranks.

For matched samples, several innovations on the sign test and signed- 
rank test appeared. With respect to the hypothesis that the medians of two 
not necessarily independent variables have a particular value, Blumen 
(1958) developed a new bivariate sign test. In the instance of Hodges’s 
bivariate sign test, Klotz (1959) obtained the complete null distribution 
from n equal to 1 through 30. Of considerable help to researchers are the
tables for the sign test prepared by Arthur Cohen (1959) that furnish 
maximum likelihood estimates of binomial parameters when the probabili- 
ties differ from one-half. Extending the two-sample sign test to k-variate 
distributions involving three or more matched samples, Wormleighton 
(1959) proposed a test statistic based on study of tests of permutation 
symmetry that with an asymptotic chi-square distribution contains more 
degrees of freedom than Friedman’s test and offers sensitivity to a larger 
variety of alternatives. Related papers were those of Steel (1959b), who 
developed a multivariate sign test, and of Walsh (1959c), who presented 
an exact nonparametric model in the instance of randomized blocks. 

With regard to the signed-rank test, Walsh (1959a) attempted to clarify
certain recent misunderstanding concerning the equivalence of Wilcoxon’s 
test to a subclass of some tests that he had proposed previously and went 
on to demonstrate that his nonequivalent results contain useful properties 
superior to the information furnished by Wilcoxon’s procedures. For the 
Wilcoxon signed-rank procedure, Pratt (1959) described ways for handling
zeros and ties.


Tests of Randomness 


That the study of randomness constituted an area of major interest was 
apparent from several papers. Two closely related studies by Barton, David, 
and Mallows (1958) and by Barton and David (1958) treated application 
of Wilcoxon’s and the rank-test statistics, respectively. The substantive 
aspect of the two papers is a paired comparison task concerned with a 
sequence of two alternatives, such as the requirement for a judge to rank 
in order of age the pictures of N₁ men and N₂ women when in actuality all
individuals are of the same age. Obviously the null hypothesis of random-
ness is appropriate to ascertain whether the existence of bias on the part 
of the experimenter to judge women to be of higher or lower age can be 
inferred. 

Likewise Goodman (1958) proposed a simplified-runs test and likeli- 
hood-ratio test of randomness in a sequence of two or more alternatives 
that could be simplified to significance tests similar to those applied to 
determine independence in contingency tables. In a paper related to the 
problem handled by Hotelling’s T² statistic, Chung and Fraser (1958)
proposed several nonparametric randomization tests for the multivariate 
two-sample problem—on the doubtful assumption, however, of independ- 
ence among the variables—and offered a method for simplifying computa- 
tions in the instance of larger samples. 


The Tau, Rho, and Other 
Nonparametric Coefficients of Association 


The most important article concerning measures of association was that 
by Kruskal (1958), who, in emphasizing both the probabilistic and opera- 
tional interpretations of population values in conjunction with rank meas- 
ures of association for bivariate populations, discussed comprehensively 
the quadrant measure, Kendall’s tau, and Spearman’s rho. Kruskal de-
scribed their interrelationships, as well as their connections with certain
measures of association found in cross-classifications; surveyed underlying 
sampling theory; developed an informative historical frame of reference; 
and stated his preference for use of tau instead of rho. Having 25,000 sets 
of correlated random normal deviates available, Fieller, Hartley, and Pear- 
son (1957) investigated and compared the sampling distributions of three 
measures of rank correlation—Spearman’s rho, Kendall’s tau, and the 
Fisher-Yates index.
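
The two best-known rank measures discussed in this section can be computed directly from their definitions. The sketch below assumes untied observations, for which tau is the difference between the numbers of concordant and discordant pairs divided by the number of pairs, and rho is one minus six times the sum of squared rank differences divided by n(n² − 1). Function names are hypothetical, and no sampling theory is implied.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau for paired observations without ties."""
    s = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        s += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    n = len(x)
    return s / (n * (n - 1) / 2)


def ranks(values):
    """Ranks 1..n of the values, assuming no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho from squared rank differences, assuming no ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(kendall_tau(x, y), spearman_rho(x, y))
```

For these illustrative rankings two of the ten pairs are discordant, giving a tau of 0.6, while the sum of squared rank differences is 4, giving a rho of 0.8; the two coefficients measure agreement on different scales and need not coincide.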

Studying the rank analogues of the familiar product-moment partial 
correlation, Somers (1959) showed that the ordinary product-moment 
correlation coefficient, rho, and tau are specialized cases of a generalized 
coefficient. Taking Somers’s paper as a point of departure, Goodman (1959) 
presented significance tests appropriate to a number of different partial 
correlation coefficients that are related to tau. 

Hays (1960) set forth an alternative measure of concordance which, 
though parallel to Kendall’s coefficient W, is a function of the average 
Kendall tau coefficient among all possible pairs of judges; he suggested 
a significance test of this concordance index. To measure association in 
a contingency table with ordered categories, Karon and Alexander (1958) 
proposed a modification in Kendall’s tau coefficient. Easing the computa- 
tional effort in calculation of tau is Griffin’s (1958) simple graphic method. 

For the calculation of an average Spearman rho correlation between 
rankings on a criterion measure and a set of m independently made rank- 
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ings corrected for ties, Cureton (1958) developed and illustrated a formula. 


In the instance of nominal scales, J. Cohen (1960) presented a coefficient 
of agreement. 
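
Cohen's coefficient, generally known as kappa, compares the observed proportion of agreement p_o with the proportion p_e expected by chance from the marginal totals: kappa = (p_o - p_e) / (1 - p_e). The sketch below is one way to compute it from a square table of counts; the function name and the table are hypothetical.

```python
def cohen_kappa(table):
    """Kappa from a square table of counts (rows: judge A, columns: judge B)."""
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the main diagonal.
    p_o = sum(table[i][i] for i in range(k)) / total
    # Chance agreement: sum of products of the marginal proportions.
    row_p = [sum(table[i]) / total for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    p_e = sum(row_p[i] * col_p[i] for i in range(k))
    return (p_o - p_e) / (1 - p_e)


# Hypothetical counts for two judges sorting 50 cases into two categories.
table = [[20, 5],
         [10, 15]]
kappa = cohen_kappa(table)
print(kappa)
```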


Regression and Correlation 


In correlation and regression theory, an important general theoretical 
paper was that by Kiefer and Wolfowitz (1959), who considered optimum 
experimental designs and computational procedures in regression problems 
of estimation and testing of hypotheses from the standpoint of several cri- 
teria. Box and Draper (1959) described a design for fitting a polynomial 
to a true function with minimum error over a specified region. 

For the situation in which samples are taken from bivariate non-normal 
populations, Srivastava (1960) carried out a theoretical investigation of 
the sampling distribution of regression coefficients. Additional contribu- 
tions to the distribution theory were those of Williams (1959), who pre- 
sented an approximate significance test for the difference between two non- 
independent correlation coefficients; of Hooper (1958), who investigated 
asymptotic variances of canonical correlation coefficients with applications 
to cases of both zero-order and multiple-correlation coefficients; and of 
James (1960), who studied the distribution of the latent roots of the co- 
variance matrix. 

In two other theoretical papers on multivariate analysis, Pillai and 
Samson (1959) developed expressions for the moments of Hotelling’s 
generalization of T², and Lawley (1959) obtained results pertaining to
the approximate distribution of canonical correlation coefficients. 


Estimation of Parameters 


Making use of the theory of least squares in relation to problems con- 
cerning linear hypotheses in multivariate analysis, Rao (1959) furnished 
estimates of parameters as well as test criteria when the variances and 
covariances, though unknown, can be estimated. Since the distribution prob- 
lems pose no particular difficulties, valid inferences can be made by means 
of reference to available significance tables of t and F. Nicholson (1957) 
concluded that no use should be made of incomplete multivariate samples 
in problems of prediction, although under certain circumstances all the 
observations in an incomplete sample can be used to construct improved 
estimators. 

For two variables with a bivariate normal distribution, Olkin and Pratt 
(1958) developed unbiased estimates of certain correlation coefficients and 
included tables to facilitate the process. Replacing a least-squares estimate 
by one quickly determined, Barton and Casley (1958) pointed to the 
applicability of the latter index to certain topics of censored data and its 
consistency under the condition of a structural, rather than a regressive,
relationship between the dependent and independent variables. 

In the instance of an exponential function, Finney (1958) described 
methods for estimation of a key parameter. In two particularly interesting 
papers based on economics research with implications for psychology, 
Quandt (1958, 1960) considered problems of estimating and testing 
hypotheses about parameters in a linear-regression system when, at a 
certain value for the independent variable, such as time, there is a suspected 
switch in the trend of the relationship. 

For the situation in which two regression lines intersect, Kastenbaum 
(1959) furnished a confidence interval. The construction of confidence 
intervals with respect to arbitrary real functions of multiple correlation
coefficients was the subject of a note by Mandel (1958). Roy and Gnan- 
adesikan (1957) also devoted efforts to finding multivariate confidence 
bounds. 


Serial Correlation 


There was less interest in problems of serial correlation than during the 
previous three years. Weinstein (1958) considered various definitions of 
the serial-correlation coefficient relative to the estimation of autoregressive 
parameters from a short time series, and, in terms of his examination of 
estimates obtained from three experimental series, concluded that the esti- 
mates are less influenced by changes in definition of serial correlation than 
by differences in basic method of estimation. He then introduced a new 
definition of the serial correlation. Siddiqui (1958) studied the distribu- 
tion of a serial correlation, and McGregor (1960) proposed an approxima- 
tion test for serial correlation in polynomial regression. 


Regression Analysis and Prediction 


A variety of problems was studied in the area of regression analysis 
and prediction. In the prediction of a continuous dependent variable from 
several independent variables (some of which are assigned dummy values 
corresponding to membership classifications as in a nominal scale), Suits 
(1957) described restraints that must be imposed upon the parameters of 
the regression equation in order that determinate estimates can be ob- 
tained. Cox (1958b) described methods of regression analysis for instances 
in which the dependent variable can assume only two values such as 0 and 
1; subsequently he (1958d) extended application of his model to the 
analysis of two-by-two contingency tables involving matched pairs and to 
testing the extent of agreement between a binary sequence of observed 
values and a corresponding sequence of probabilities. 

For treatment of certain types of experimental data, Williams (1958) 
described how simultaneous regression equations could be employed, and
expressed a preference for their use to use of the discriminant function. In 
estimating the regression coefficient of y on x, Cox (1960) showed how 
increased precision can be realized when prior information on a supplementary
variable to which y and x are related is available. In a similar
vein, Seal (1959) furnished and illustrated a model for a sampling plan 
that permits the obtaining of measures on an expensive variable from 
knowledge furnished by an inexpensive auxiliary variable with which the
former variable is highly correlated. 

Two contributions in regression analysis of particular interest to psy- 
chologists appeared. In their expository treatment of path analysis, Turner 
and Stevens (1959) used simple diagrams to explain the conceptual nature 
of cause and effect in regression analysis as well as to describe properties 
of feedback and homeostasis. In a critical discussion of their paper, Wright 
(1960) advocated the systematic substitution in path analysis of what he 
considered dimensionless path coefficients by corresponding concrete path 
regressions. The problem of covariance was the subject of the second important
paper, in which Lord (1960), in the instance of large samples, presented
formulas to allow for the fallibility of measures in the control
variable—a circumstance that is particularly pertinent to many investiga- 
tions in education and psychology. 

How to select a limited number of predictor variables from a larger set 
in regression analysis was the central problem of three papers. Comparing 
both theoretically and numerically the Doolittle, the Wherry-Doolittle, and 
the Summerfield-Lubin methods of multiple correlation, Anderson and 
Fruchter (1960) demonstrated the equivalence of the latter two in selecting 
the same set of predictors in the same order and recommended the use of 
the Summerfield-Lubin formulation as the best least-squares procedure 
in view of its computational ease, compactness, and clarity of interpreta- 
tion of interim values. Making use of the expected value of the size of a 
confidence interval over all possible regression samples and over all pos- 
sible sets of predictors as a basic criterion for the precision of a selected 
set of predictor variables, Linhart (1960a) furnished and illustrated his 
method for determining which set of r variables out of k available ran- 
domly distributed ones should be chosen. In a related paper, Linhart 
(1960b) subsequently evaluated criticisms concerning the choice of a 
measure of predictive precision in regression analysis, and referred particularly
to use of the expected value E(l) of a confidence interval of
width l for the variable to be predicted.


Computational and Graphic Aids 


Allied to the three papers just cited were several others helpful in sim- 
plifying or reducing computational labor in regression analysis. The table 
prepared by Steck (1958) for computing trivariate probabilities has long 
been needed. Greenberg and Sarhan (1959) discussed applications of
matrix inversion in the analysis of correlational data. To obtain higher- 
order regression coefficients, Cowden (1958) described analogues to a 
method by which higher-order partial correlation coefficients are calculated 
from those of a lower order. Foote (1958) presented a simple desk-calcu- 
lator method of obtaining multiple and partial correlation and regression 
coefficients that involves no back solution. In the analysis of several num- 
bers of measures on the same individuals, Schutz (1960) presented a labor- 
saving technique that he referred to as the “little jiffy correlator.” 

For the determination of a multiple correlation coefficient R, Waugh
and Fox (1957) demonstrated a graphical method, and, in the instance 
of moving averages and adjustments in time series, Mincer (1957) showed 
how a graphical approach could be employed. 


Curve Fitting 


Contrary to the oversimplified impression conveyed by most textbooks, 
the fitting of a linear-regression equation customarily rests on the assump- 
tion of fixed, or error-free, values in the independent variables. That the 
fitting of a regression line is not a simple and mechanical process has been 
clearly set forth in a definitive and penetrating article by Madansky (1959), 
who considered the implications of presence of error in both the inde- 
pendent and dependent variables. He surveyed and evaluated solutions 
for determining consistent estimates of slope and intercept of regression 
lines from samples of paired observations when various assumptions regarding
the properties of error are made and when various types of information are
available for constructing consistent estimates. Pertinent to the problems
just posed are the contents of the previously cited paper by Barton and 
Casley (1958), who furnished improved though rapid estimates of regres- 
sion coefficients. On the other hand, in choosing to ignore the presence of 
error in the independent variable but to allow error in the dependent 
variable to be randomly distributed, David and Arens (1959) proposed 
criteria for spacing a given set of paired observations to achieve an op- 
timal straight line. After pointing out that the true line may vary from ex- 
periment to experiment when a sequence of observations is taken over the 
same set of values on the predictor variable because of the presence of un- 
controlled factors from one set of runs to another, Scheffé (1958) furnished 
a mathematical model for fitting the line relative to the hypotheses of 
equality of slopes of the true lines or an identity of the true lines. 

For certain types of cumulative data in which the error of successive ob- 
servations may not be independent, Mandel (1957) described two models 
for both independent and cumulative data and through them showed that 
the frequently used least-square estimates of independent error derived 
from the first model are not applicable to cumulative data conforming to 
the second model. In order to achieve a smoothing of probability-density 
functions, Whittle (1958) developed an equation that determines an
optimal-weighting function. Citing certain pedagogical advantages, Karst
(1958) described a method of linear curve fitting by means of which the 
sum of the absolute values of vertical discrepancies of points to the line 
is a minimum. Wagner (1959), also rejecting the least-squares approach, 
suggested two alternative criteria in linear-programing techniques for re- 
gression analysis. Askovitz (1959) presented short-cut techniques embody- 
ing centroids of sets of points that could be utilized in least-square applica- 
tions of line fitting and in the determination of the mean of a frequency 
distribution. 
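
The criterion of least absolute vertical discrepancies mentioned above can be illustrated with a small Python sketch. It is not Karst's procedure: it is a brute-force search that relies on the standard result that some minimizing line passes through at least two of the observed points. The function name and the data are hypothetical.

```python
from itertools import combinations


def lad_line(points):
    """Return (intercept, slope, loss) minimizing sum |y - a - b*x|.

    Only lines through two observed points are tried, which is enough
    to reach the minimum for a straight line with an intercept.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b*x
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        loss = sum(abs(y - a - b * x) for x, y in points)
        if best is None or loss < best[2]:
            best = (a, b, loss)
    return best


# Hypothetical data: four points on y = x and one discrepant point.
points = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0), (4, 10.0)]
a, b, loss = lad_line(points)
print(a, b, loss)
```

In this example the fitted line follows the four collinear points and leaves the whole discrepancy to the fifth, whereas a least-squares line would be pulled toward it.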


Miscellaneous Articles 


Making use of a table of random numbers, Hoffman (1959) described 
a procedure for constructing pairs of variables so that their correlation 
will be equal to any specified predetermined magnitude. For N distributions
of variates, each based on the same population, Willis (1959) derived
lower-bound formulas for the mean intercorrelation coefficient. 
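
One standard way to build a pair of variables with a prescribed sample correlation is sketched below in Python. It is offered only as an illustration of the idea and is not Hoffman's procedure: it assumes pseudo-random normal deviates in place of a table of random numbers, and all function names are hypothetical.

```python
import math
import random


def center_and_scale(v):
    """Shift v to mean 0 and scale it to unit sum of squares."""
    m = sum(v) / len(v)
    c = [a - m for a in v]
    norm = math.sqrt(sum(a * a for a in c))
    return [a / norm for a in c]


def pair_with_correlation(r, n, seed=0):
    """Build x and y of length n whose sample correlation is r (up to rounding)."""
    rng = random.Random(seed)
    x = center_and_scale([rng.gauss(0, 1) for _ in range(n)])
    e = center_and_scale([rng.gauss(0, 1) for _ in range(n)])
    # Remove from e its component along x, so that the two are orthogonal.
    proj = sum(a * b for a, b in zip(x, e))
    e = center_and_scale([b - proj * a for a, b in zip(x, e)])
    # Mix the two orthogonal unit vectors in the proportion that yields r.
    y = [r * a + math.sqrt(1 - r * r) * b for a, b in zip(x, e)]
    return x, y


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x, y = pair_with_correlation(0.6, 50)
print(round(pearson(x, y), 6))
```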

Representing a substitute for Fisher’s well-known z conversion pro- 
cedure, Nair’s transformation of a correlation coefficient was studied by 
Sankaran (1958), who concluded that a corresponding inverse-sine trans- 
formation of this new coefficient in several situations is as satisfactory as 
Fisher’s transformed coefficient. After introducing a family of modifica- 
tions to Fisher’s transformation of a correlation coefficient, Laubscher 
(1959) was able to show that within the family Fisher’s form of the 
transformation is optimum. 
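
For reference, Fisher's transformation is z = (1/2) ln[(1 + r) / (1 - r)], the inverse hyperbolic tangent of r. The Python sketch below applies it to obtain an approximate confidence interval, using the usual large-sample standard error 1 / sqrt(n - 3). The function names, the critical value 1.96, and the example figures are assumptions made for illustration.

```python
import math


def fisher_z(r):
    """Fisher's z transformation: atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def approx_interval(r, n, crit=1.96):
    """Approximate interval for a correlation, computed on the z scale."""
    z = fisher_z(r)
    half = crit / math.sqrt(n - 3)  # large-sample standard error of z
    # Transform the limits back to the correlation scale.
    return math.tanh(z - half), math.tanh(z + half)


# Hypothetical example: r = 0.5 observed in a sample of 103 cases.
low, high = approx_interval(0.5, 103)
print(low, high)
```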

Additional References: Finney (1960); Guttman and Guttman (1959);
Ostle and Steck (1959); Rao (1958); Schaie (1958); Williams (1959a);
Wright, Manning, and DuBois (1959).


Factor Analysis 


Some of the problems involving communality estimation, rotation, and 
factorial invariance were clarified, and reformulations of the factor prob- 
lem were offered. More researchers in education and psychology than previ- 
ously used factor-analytic techniques. This section covers the most impor- 
tant methodological studies. 


Communality Estimation and the Number of Common Factors 


The problems of communality estimation and the number of factors to 
extract are inherently related, since estimation of the communality is made 
in order to define the common factor space. It can be safely stated that no 
solution for the communality problem has been found—and indications 
are that under the classical formulation of the problem none will be found. 
Thus it is only natural that factor analysts turn toward redefining the
concept of communality or toward attempting to determine the conditions 
under which prevailing attitudes do not seriously affect the psychological 
interpretations of factor structures. 

Criticisms of the current factor theory are based on the following 
reasons: (a) formulation of the communality problem on the basis of 
minimal rank of the correlation matrix (Cattell, 1958; Guttman, 1958a; 
Wrigley, 1957a, 1959); (b) a priori acceptance of Thurstone’s idea of
parsimony (Guttman, 1958a, b); (c) definition of the term communality 
(Wrigley, 1957b); and (d) failure to take into account the stochastic 
properties of the measures used and their effect upon the final factorial 
structure (Wrigley, 1959). 

In the use of communality estimates, Guttman’s classic lower bound as 
given by the squared-multiple correlation between one variable and the 
n − 1 remaining variables seems to be firmly entrenched (Cattell, 1958;
Tryon, 1957a; Wrigley, 1957a, b, 1958, 1959). After reviewing the com- 
munality formulations of Spearman, Thurstone, Guttman, and Tryon, 
Wrigley (1957b) stated that the squared-multiple correlation coefficient as 
an estimate of the communality in many ways overcomes the previous 
problems of communality estimation, because there are no alternative sets 
of diagonal values; the squared-multiple coefficient is probably less in- 
fluenced by sample size (thus overcoming some objections to stochastic ap- 
proaches), since values calculated from larger samples may be higher or 
lower. Thus the problem of determining the communality becomes separate 
from that of determining the number of factors. Wrigley pointed out the 
disadvantage of the lack of a test of significance, but left the solution of 
this problem to the statisticians. He also indicated the use of the squared- 
multiple correlation as an initial estimation of the communality in order 
to reduce the time involved when using Lawley’s maximum-likelihood 
methods (Wrigley, 1958) and its correction by subtracting from such a 
diagonal value the mean of the rejected latent roots. Although Wrigley 
had stated that such use of the squared-multiple correlation produces good 
results, he pointed out that the communality might converge to an unlikely 
value, and that, since the communality based upon maximum likelihood 
depends on the size of the sample of persons, the twin decisions of communalities
and the number of factors are confounded.

Wrigley (1959) also attempted to approach the communality problem 
as a function of the number of factors being extracted. His results indicated 
irregular increases in the communality as more factors were postulated. 
Moreover, while using Guttman’s lower bound as an initial estimate, he 
found that some communalities eventually became greater than unity. 
The rate of convergence of the communality estimates appeared fastest 
when a small number of factors was hypothesized. In some cases the values 
had not converged after 80 iterations. His study also indicated that the 
last value obtained for the communality was a function of the initial com- 
munality estimate. 

Approaching the problem of common-factor space from a different angle,
Cattell (1958) called for the clarification of error factors and real factors, 
population factors and sample factors. Thus the problem is not reduced to 
attaining minimal rank nor to finding the standard error of a factor load- 
ing. In the process of separating real and error factors Cattell suggested 
that initial estimates of the communalities be based on the squared-multiple 
correlation and that concern be given to reproduction of the off-diagonal 
elements of the correlation matrix. 

Guttman (1958a, b) criticized the desire that factorial research be based 
on the Thurstonian concept of parsimony, citing that all evidence which 
has been collected by methods outlined by Thurstone points toward negation
of this idea. Guttman disproved the claim that the rank of a correlation
matrix may always be reduced to the smallest integer greater than
½(2n + 1 − √(8n + 1)); he showed instead that the best possible upper bound
is n − 1. He also postulated that for a simplex the rank can never be less than n − 2,
but DuBois (1960) succeeded in providing an example in which Guttman’s 
contention is not upheld. Further evidence by Kaiser (1960a) indicated 
that the alpha-reliability of a factor is depressed when communality esti- 
mates are made and when the number of factors extracted is small. In this 
light Kaiser suggested use of unities in the diagonals, a principal-axis 
solution using all factors having latent roots greater than one (1960b). 

An attack upon the communality problem through a redefinition and 
generalization of the common-factor theory was proposed by Tryon 
(1957a, b, 1958a, b, 1959). Under his reformulation, Tryon (1957a) 
maintained that his three definitions lead to “precise” formulas for the 
determination of the communality (a) from the k necessary and sufficient 
dimensions derived by iterative factoring, (b) from the n — 1 remaining 
variable-domains, and (c) from the k’ multiple clusters of the n variables. 
Kaiser (1959b), however, pointed out that convergence necessary under 
Tryon’s formulation cannot always be reached with empirical matrices; 
in fact the solution for communalities converges if and only if the matrix 
has unique minimum-rank communalities. For clarification of the numerous 
cluster-analysis techniques, key-cluster analysis, total communality, cumu- 
lative communality, preclustered cumulative communality, and rational 
cumulative communality, the reader is referred to Tryon (1959). 

Tyler and Michael (1958) reported an empirical investigation concern- 
ing the problem of communality estimation and concluded that for the 
matrix under study there appears to be no loss of psychological meaning- 
fulness when either estimates of communalities determined iteratively or 
unities are used. Cattell (1958) also suggested the decreasing importance 
of the exact value of the communality from the empirical point of view if 
correlation matrices are large, when he indicated that the diagonal elements 
are only 1/nth the number of elements in the matrix. However, Cattell 
indicated that the use of ones in the diagonal is unscientific, since it does 
not permit the separation of error factors from the common factors. An 
excellent review of the communality problem, its problems and meaning, 
can be found in Dickman’s (1960) dissertation, where a clear distinction
is made between matrix, domain, and population-factor analysis. 

For the researcher seeking to solve the problem of what should be placed 
into the diagonals, no exact answer can be given. Indications are that the 
insertion of Guttman’s lower-bound value, the squared-multiple correlation, 
and the use of principal components having positive latent roots is the 
most reasonable and scientific approach at this time. 
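
The squared-multiple correlation of each variable with the remaining n − 1 variables can be read from the inverse of the correlation matrix as 1 − 1/r^ii, where r^ii is the i-th diagonal element of the inverse. The Python sketch below computes these diagonal values for a small matrix. The function names and the correlations are hypothetical, and the plain Gauss-Jordan inversion is chosen for self-containment rather than numerical refinement.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def squared_multiple_correlations(R):
    """Squared-multiple correlation of each variable with all the others."""
    inv = invert(R)
    return [1 - 1 / inv[i][i] for i in range(len(R))]


# Hypothetical correlation matrix for three tests.
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]
smc = squared_multiple_correlations(R)
print(smc)
```

These values would replace the unities in the diagonal before factoring.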


Rotation 


During the last three years the formulation of one of the most adequate 
solutions to the problem of orthogonal analytic rotation has occurred. 
Unfortunately, the oblique case, although studied with much vigor, has not 
as yet provided researchers with a fully acceptable criterion despite Car- 
roll’s (1957) important advance. 

The major significant breakthrough in the orthogonal case was given 
by Kaiser (1958) with his varimax criterion. The varimax criterion maxi- 
mizes the variance of the squared elements by columns after each test has 
been corrected for uniqueness. (The correction is removed after rotation is 
completed.) Kaiser pointed out that not only is Thurstone’s criterion of 
simple structure attained, but also the more important characteristic of 
factorial invariance is realized. Indications are that the varimax criterion 
has become well accepted by researchers using factor analytic techniques 
(Comrey, 1959; Dickman, 1960). Since the criterion to be maximized has 
been outlined for purposes of computer programing (Kaiser, 1959a), it 
should be made part of a basic computer library for those engaged in 
factor analyses (Kaiser, 1960b). Although Kaiser also generalized the 
varimax criterion to the oblique case (covarimin), indications are that it 
is biased in that the factors seem to approach a position tending toward 
orthogonality (Carroll, 1958). 
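
To make the orthogonal criterion concrete, the Python sketch below evaluates the normal varimax criterion as described above: each row of loadings is divided by the square root of its communality, and the variances of the squared values are summed over columns. For the two-factor case it then searches a grid of rotation angles for the largest value. This is an illustrative sketch only, not Kaiser's computing procedure; the function names, the grid size, and the loadings are hypothetical, and every test is assumed to have a nonzero communality.

```python
import math


def normal_varimax_criterion(loadings):
    """Sum over factors of the variance of squared, row-normalized loadings."""
    n = len(loadings)
    k = len(loadings[0])
    h = [math.sqrt(sum(a * a for a in row)) for row in loadings]  # sqrt of communality
    total = 0.0
    for j in range(k):
        sq = [(loadings[i][j] / h[i]) ** 2 for i in range(n)]
        mean = sum(sq) / n
        total += sum((s - mean) ** 2 for s in sq) / n
    return total


def rotate_pair(loadings, angle):
    """Orthogonally rotate a two-factor loading matrix through angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def varimax_two_factors(loadings, steps=9000):
    """Grid search over [0, pi/2) for the angle with the largest criterion."""
    best_angle = max(
        (math.pi / 2 * t / steps for t in range(steps)),
        key=lambda ang: normal_varimax_criterion(rotate_pair(loadings, ang)),
    )
    # The normalization is applied only inside the criterion, so the
    # returned loadings are in the original metric.
    return rotate_pair(loadings, best_angle)


# Hypothetical unrotated loadings for four tests on two factors.
unrotated = [[0.7, 0.5], [0.6, 0.6], [0.7, -0.4], [0.6, -0.5]]
rotated = varimax_two_factors(unrotated)
print(normal_varimax_criterion(unrotated), normal_varimax_criterion(rotated))
```

Because the rotation is orthogonal, the communality of each test is unchanged; only the distribution of its variance across the two factors is altered.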

After reviewing the problems involved with the analytic criterion as 
defined by his original quartimin and Kaiser’s covarimin, Carroll (1958) 
proposed that a combination of these two criteria be used, for the quartimin 
produced an opposite bias to that of the covarimin criterion. Since the first 
term of Kaiser’s covarimin function is the same as the complete quartimin 
function, Carroll proposed subtracting one-half the product of the sums of 
squared loadings from the covariance term. This new criterion for rota- 
tion is called biquartimin. Although it has been reported that the biquar- 
timin has been very successful on Thurstone’s box problem (Dickman, 
1960), it has been suggested that the split between quartimin and co- 
varimin is not equal and that, when the data are more complex, a larger 
part of the covarimin function is required. The biquartimin criterion 
requires that the sum of cross-products of squared factor loadings be minimized
along with the sum of cross-products of deviations of squared factor
loadings from their mean value. 

After reviewing the inadequacies of previous analytic criteria (Carroll's
quartimin as being too nearly oblique, Kaiser’s covarimin as too nearly 
orthogonal) and pointing to the necessity for a variation in the proportion 
of combined criteria which change with the complexity of the tests, Kaiser 
and Dickman (1959) provided a new solution to the oblique case called 
binormamin in which the simplicity coefficient, or value determining the 
combinations of the covarimin and quartimin, varies as a function of the 
simplicity of the structure under study. The new approach uses the normali- 
zation step which has proved so successful in the varimax criterion. In 
effect, binormamin minimizes the sum of the cross-products of squared 
loadings normalized by rows and columns over all pairs of factors. The 
authors reported that binormamin is also subject to a bias which appears 
to be a function of the complexity of the tests. The bias is considered to be 
essentially nonexistent when data are cleanly structured but to be distinct 
for problems containing factorially complex variables. 

Cattell and Muerle (1960) severely criticized present rotational criteria 
as constituting entirely the wrong functions to be maximized or minimized. 
Cattell stated that simple structure and orthogonality are incompatible
except in rare cases and that orthogonality is of little use for scientific 
work. Cattell proposed that the hyperplane count be maximized, that is, 
the number of near-zeros in the factor columns. Essentially Cattell pro- 
posed a compromise between subjectivity and mathematical rigidity. 
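
A hyperplane count of the kind described above is simple to compute once a definition of "near-zero" is fixed. In the Python sketch below the band of ±0.10 is an assumption made for illustration, not a value taken from the source, and the function name and loadings are hypothetical.

```python
def hyperplane_count(loadings, width=0.10):
    """Count loadings with absolute value at most `width`, per factor and in total."""
    k = len(loadings[0])
    per_factor = [sum(1 for row in loadings if abs(row[j]) <= width)
                  for j in range(k)]
    return per_factor, sum(per_factor)


# Hypothetical rotated loadings for four tests on two factors.
loadings = [[0.80, 0.05],
            [0.75, -0.08],
            [0.02, 0.70],
            [0.30, 0.65]]
per_factor, total = hyperplane_count(loadings)
print(per_factor, total)
```

A rotation procedure built on this idea would compare the total across trial positions of the axes and keep the position with the largest count.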

With the advent of numerous analytic criteria for rotation and a greater 
stress upon psychological meaningfulness, a number of studies have been 
conducted to compare and to evaluate the effectiveness of each method. 
Wrigley, Saunders, and Neuhaus (1958) compared the quartimax rotation 
of the centroid factor loadings for Thurstone’s Primary Mental Abilities 
Test Battery with the Thurstone simple structure method, Zimmerman’s 
revised simple structure, Holzinger and Harman’s bifactor analysis, and 
Eysenck’s group factor analysis. The evidence indicated that the quartimax 
results agree very closely with Holzinger and Harman’s and Eysenck’s 
solutions and only moderately well with the two simple structure solutions. 
The authors pointed out further that, in terms of parsimony, the advantage 
seems to be with quartimax but that, in terms of factorial invariance, the 
varimax solution is much superior. Further study is required. 

Kaiser (1960d) conducted a similar study of analytic rotations exclud- 
ing both the Holzinger and Harman method and the Eysenck method from 
the comparison. Results indicated that psychologically there appeared to 
be no difference in the rotated solutions. However, it was emphasized that 
the merit of the varimax lies not in the observed similarities, but rather 
in the fact that the varimax is based on a scientifically more fundamental 
and more important criterion—factorial invariance. Further study with the 
varimax criterion was made by Comrey (1959) using data from the Min- 
nesota Multiphasic Personality Inventory. Comrey reported that the vari- 
max in general is more satisfactory than Thurstone’s method and that, if 
a choice is available, the varimax rotation should be preferred. He also 
suggested that it be used prior to oblique rotation when factor plots are to
be made. 

Of interest to those who do not have available electronic computers is 
the study by Fruchter and Novak (1958) who compared the 2 x 2 graphi- 
cal method, Thurstone’s analytic method, and a “direct rotational” method 
devised by Harris. Using as their criterion the principle of simple struc- 
ture, the authors found that the graphical method is superior, but that the 
direct rotational method is the most economical of the researcher’s time. 
Further work with the rotational method devised by Thurstone has been 
conducted by Sokal (1958). Sokal found Thurstone’s results unsatisfactory 
because in some circumstances a given trial vector would not yield a 
reference vector with a well-defined hyperplane. This occurred because of 
the problem of collinear vectors, which makes it impossible to assure the 
correct selection of a reference vector for purposes of simple structure. 
Sokal’s modification, which is restricted to calculation by electronic com- 
puters, considers all the variables simultaneously. 

Working from the Gram-Schmidt method of establishing an orthonor- 
mal basis, Cureton (1959) adapted this useful mathematical technique to 
determine a transformation matrix that will rotate factors so that one of 
the new axes may be placed in a predetermined position. Such a technique 
may prove useful when a battery of tests includes “marker” variables. 
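
The underlying idea can be sketched in Python as follows: the Gram-Schmidt process is started from the desired direction, so that the first new axis falls in the predetermined position and the remaining axes are orthogonal to it. This is a sketch of the general technique, not a reproduction of Cureton's procedure; the function names and the loadings of the "marker" variable are hypothetical.

```python
import math


def orthonormal_basis_from(first):
    """Gram-Schmidt: orthonormal axes whose first member is `first` normalized."""
    k = len(first)
    # Candidates: the desired direction, then the original coordinate axes.
    candidates = [list(first)] + [[1.0 if i == j else 0.0 for i in range(k)]
                                  for j in range(k)]
    basis = []
    for v in candidates:
        w = list(v)
        for b in basis:
            dot = sum(x * y for x, y in zip(w, b))
            w = [x - dot * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > 1e-8:  # skip candidates already spanned by the basis
            basis.append([x / norm for x in w])
        if len(basis) == k:
            break
    return basis


def rotate(loadings, basis):
    """Project each row of loadings on the new axes (the rows of `basis`)."""
    return [[sum(a * b for a, b in zip(row, axis)) for axis in basis]
            for row in loadings]


# Hypothetical marker variable whose direction should define the first axis.
marker = [0.3, 0.4, 0.0]
basis = orthonormal_basis_from(marker)
rotated = rotate([marker, [0.5, 0.1, 0.6]], basis)
print(rotated)
```

After rotation the marker variable loads only on the first new axis, and the communality of every variable is preserved because the transformation is orthogonal.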

In summary it may be stated that, for the orthogonal case, the normal 
varimax criterion appears to be in greater use than other orthogonal pro- 
cedures. Unfortunately, since generalization to the oblique case has only 
been made by attempting compromises with previous orthogonal criteria, 
a definitive criterion is yet to be found. Electronic computers are an abso- 
lute necessity if current trends in analytic rotation continue. 


The Common-Factor Problem 


Increased use of factor analytic techniques has brought about the neces- 
sity for the comparison of factors among different studies. For the case 
involving different but not parallel tests for the same group, Tucker (1958) 
proposed the “interbattery” method of factor analysis. Tucker’s procedure 
utilizes the correlation matrix between batteries of tests in determining the 
common factors rather than the more conventional manner of analyzing 
the factor structure of the correlations within batteries. The result permits 
consideration of those factors which are common only to both batteries. 
Although Tucker pointed out that this method requires no estimate of the 
communalities, Gibson (1960) has shown that, since certain assumptions 
made by Tucker may not hold, the communality problem must be faced. 
The Tucker method provides a statistical test for the minimum number of 
factors involved and for the calculation of the correlations between corre- 
sponding factors. Gibson (1960) pointed out that Tucker’s method assumes 
that the vector configuration of the two tests within the factor space of 
overlap contains two sets of principal axes in the same location and also 
assumes that their associated latent roots are identical. Gibson suggested
that the problem can be overcome by selection of such test batteries that 
there exists a parallelism between sub-batteries so that their sums of squared 
loadings on a factor cannot be too different from any other factor. Gibson 
demonstrated that failure to create such a condition may lead to imaginary 
factor loadings. 


General Computational Procedures 


Wherry (1959) outlined a method of obtaining hierarchical factor solu- 
tions without the necessity for rotation. Instead of factoring and rotating 
the factor structure at each stage, inverting and normalizing the reference 
vector correlation matrix, and then refactoring, the Wherry method, which 
begins with the multiple group method of factoring, assumes that, if all 
overlap is removed from the clusters, they will have simple structure with 
respect to each other. In a short paper Wolins (1959) presented a modifica- 
tion of the Wherry-Winer method for factoring large numbers of test items. 
The new method appears much easier to use in conjunction with tetrachoric 
correlations. 

Using the principle of maximum likelihood for the estimation of factor 
loadings when certain loadings are assumed a priori to be zero, Lawley 
(1958) concluded that factor loadings can be calculated under such an 
assumption for both the orthogonal and oblique cases. Lawley showed that 
for various hypotheses it is possible to solve numerically the maximum 
likelihood equations of estimation, but that the amount of work with 
matrices of even small order necessitates the use of an electronic computer. 

Guttman (1959) pointed out that analysis of correlation matrices by 
factor analytic techniques is justified stochastically only if the regressions 
are linear, and he has shown that in general one set of new scores, at most, 
can be found to maintain the observed rank orders. Guttman posed and 
answered the following two questions: Can real numbers be assigned to 
given qualitative categories for a given population in such a way that the 
resulting numerical variables will have linear regressions on each other? 
If so, in how many ways can this be done, and what are they? He also 
pointed out that, if the question cannot be answered, nonlinear theories of
scale analysis, latent structure analysis, or facet analysis are in order. 
In another paper Guttman (1957) presented empirical evidence of correla- 
tion matrices that conform to his radex theory. Two lists are cited, one 
representing approximate simplexes and the other approximate circum- 
plexes. Fruchter and Fleishman (1957) reported a study in which they 
attempted to determine whether the presence of spuriously high intercor- 
relations among experimentally dependent variables distorts the common- 
factor structure of a battery. Results showed that the structure of the com- 
mon factors is not greatly affected. 

In an excellent article contrasting factor analysis and cluster analysis, 
Tryon (1958b) described the method and theory underlying both “v-analysis”
and “o-analysis.” The “v-analysis” is concerned with grouping
a minimal set of behavioral properties that are most independent and best
predict the scores of the subjects. The “o-analysis” is a procedure for
locating and conceptualizing types of objects or persons.

Madansky (1960) presented extensions of existing determinant methods 
for the solution of accounting equations in latent class analysis. McHugh 
(1958) also published a short paper outlining corrections of some of his 
earlier work on latent class analysis. He pointed out where stronger state- 
ments can be made about identifiability of structural parameters. 

Gibson (1959) set forth an excellent outline of the theoretical formula- 
tions relating factor analysis, latent structure analysis, and latent profile 
analysis, and he attempted to show how the latter two models avoid the 
difficult problems of communality estimation, rotation, and curvilinearity 
that plague conventional factor analyses. Limitations of the latent profile
analysis were pointed out as being: (a) lack of a scale of measurement for
the latent continuum and (b) definition of the number of necessary latent
dimensions since as many as q − 1 dimensions would be required, where
q is the number of latent classes.

Maxwell (1959) recently outlined a number of statistical tests which he 
considers ought to be used by factor analysts, in view of the fact that factor 
analysis lacks the sophistication of classical statistical methods in not hav- 
ing such information as the standard error of a factor loading. Significance 
tests for an entire correlation matrix, for the comparison of two variance 
or covariance matrices, and for residual matrices were given. 

Hotelling (1957) explained that in many instances certain statistical 
procedures, such as regression analysis, multiple correlation, and multi- 
variate analysis of variance, are more appropriate techniques than factor 
analysis. In attacking the problem of dimensionality of a continuous 
multivariate population, Hotelling pointed out that the rank is equal to that 
of the sample under certain conditions, provided the number of degrees 
of freedom among subjects is greater than the number of variables. How- 
ever, when the observed score is considered to be made up of a real part 
and a random error, dimensionality can only be ascertained by obtaining 
estimates of the errors by suitable replications and the use of multivariate 
analysis rather than by factor analysis. Hotelling also suggested a method 
for comparing covariance matrices using the characteristic equation and 
distributions of the latent roots.

Additional References: Other references of general interest, which are 
worthy of reading particularly by those interested in empirical examples 
or by those who are only beginning their study of factor analysis, include 
the contributions of Bernyer (1957); Borgatta (1958-59); Dingman
(1958); DuBois and Manning (1959); French (1959); Garside (1958);
Kline (1959); Michael (1958); and Royce (1958). Somewhat more
mathematically oriented articles are those of Baggaley (1960); Bernyer
(1958); Demaree (1957); Hamilton (1958); and Tucker (1958).
CHAPTER V


Research Tools: Access to the Literature of Education 


MARGARET R. SHEVIAK and HAYNES MCMULLEN 


This chapter is a sequel to “Research Tools: Library Resources” by
Pierstorff (1957) in this Review. There have been no startling changes
during the last three years in the bibliographical apparatus for obtaining
access to the main body of educational literature, but two trends are
likely to be of considerable interest in the future. The first is the increas- 
ing amount of information available about educational systems outside 
the United States; the second is the development of machines and systems 
of searching which permit quick access to ideas and combinations of 
ideas not easily located through conventional bibliographies, indexes, or 
abstracts. 

This chapter will discuss first the newer conventional bibliographical 
and reference aids which are primarily useful in the study of education 
in the United States; then the aids which are mainly useful in inter- 
national studies; and, finally, the literature on systems for the mechanical 
indexing and searching of literature. Cumulative bibliographies which 
were discussed in Pierstorff’s article and which are continuing without 
change are not considered here. 


General Guides to Library Resources 


Winchell (1960) completed a third supplement to her annotated guide 
to general reference works, and Barton (1959) prepared a fourth revision 
of her briefer bibliographical guide to reference books. Walford (1959) 
edited a comprehensive guide to reference books and bibliographies with 
emphasis on current material and on material published in Great Britain. 
Another comprehensive and detailed bibliography of basic reference works 
prepared by Murphey (1958) is especially well adapted for general use, 
because it is arranged from the viewpoint of the nonspecialist, its ter- 
minology is nontechnical, and it offers guidance in the mechanics of 
research and in the preparation and style of research papers. 

The compilation of doctoral dissertations in the Index to American
Doctoral Dissertations (1956, 1957, 1958, 1959) served the dual purpose 
of continuing Doctoral Dissertations compiled by Trotier and Harman and 
indexing dissertations abstracted in Dissertation Abstracts. The form and 
features of the previous volumes sponsored by the Association of Research 
Libraries have been retained in the main. A serious omission in the 
1958-59 Index was lack of a page reference to the place in the volume 
where a dissertation was abstracted. 
A selective list of the major abstracting journals and bibliographies
for each of the social sciences was compiled by Clarke (1959). Fellows 
(1957) included 167 items in his guide to periodical publications in- 
tended as an aid to the study of current developments in the social sciences. 


The General Field of Education 


Alexander and Burke’s fourth edition (1958) of their valuable aid 
to the researcher lists many changes in sources for the location of educa- 
tional information. A full index facilitates finding a reference for which 
the researcher remembers only the popular title or the author. Napier 
(1958) provided a shorter descriptive list of sources, with emphasis on 
British publications. 

The second edition of Good’s Dictionary of Education (1959) con- 
tributed to a solid base of educational vocabulary. The fifth edition of 
Mental Measurements Yearbook (Buros, 1959) covers the years 1952-58. 
The Encyclopedia of Educational Research (Harris, 1960) appeared in 
a third edition under new editorial direction, with a number of new 
contributors. Contributors to earlier editions developed new treatments 
of their topics. The results of a symposium on important aspects of 
current educational research were presented by Phi Delta Kappa (Bang- 
hart, 1960). 

Two continuing bibliographies mentioned by Pierstorff changed title 
or editorial direction. Education in Lay Magazines (National Education 
Association, 1957, 1958, 1959, 1960) ended with No. 2, 1960, and was 
continued as Magazine Report. The annual bibliography of master’s theses 
that had been compiled by Lamke and Silvey (1957, 1958) appeared 
with Silvey (1959) as sole editor. 


Special Fields in Education 


Eells (1959a) presented an annotated bibliography of books and
periodical articles devoted to college teachers and college teaching 
methods, which included studies appearing in periodicals, as well as pub- 
lished and unpublished theses. “Index B” separately lists the doctoral dis- 
sertations in these areas. Scates and Ellis (1957, 1958, 1959) also pro- 
vided a bibliography of the same matter. 

Mezirow and Berry (1960) compiled a comprehensive guide to articles, 
government publications, pamphlets, and books in major areas of liberal 
adult education, abstracting most items. It covers publications from the 
United States, Great Britain, and Canada since the end of World War II. 

A series of annual bibliographies of reports on research (articles and 
theses) in health, physical education, and recreation was begun by Hub- 
bard and Weiss (1959, 1960). The theses were abstracted by the institu- 
tions at which they were written. Hilton and Fairchild (1960) and 
Forrester (1958) revised earlier compilations of guidance and occupa-
tional literature. 


Psychology 


An annotated list of reference books and a list of psychological journals 
were included in Daniel and Louttit’s guide for researchers in psychology 
(1953). The book surveys psychological literature and provides technical 
and stylistic aids to scientific reporting. An appendix includes sources 
for books, tests, apparatus, equipment, and supplies. English and English 
(1958) produced a compact dictionary of terms used in psychology, as 
well as some from mathematics, medicine, and psychoanalysis. 


Directories 


The Directory of University Research Bureaus and Institutes (1960) 
identifies and describes current research programs in institutions of 
higher learning in the United States and Canada. It includes bureaus, 
institutes, experiment stations, laboratories, and other research organiza- 
tions which are sponsored by colleges and universities, established on a 
permanent basis, and carrying on continuing research programs. Ap- 
pendixes list university presses and members of the National Council of 
University Research Administrators, and indexes provide location by 
name of institution and geographic area. 

Ash (1958) compiled a guide to special book collections with subject 
emphasis as reported by university, college, public, and special libraries 
in the United States and Canada. The Association Index (1958) records
directories, yearbooks, and periodicals listing nongovernmental
associations, arranged by author, title, and subject.


Education Outside the United States 


The most significant of recently published aids to the study of educa- 
tion outside the United States is UNESCO’s monumental volume on pri- 
mary education (1958), the second in its World Survey of Education 
series. For each country, it contains a monograph, usually complete with 
tables, charts, bibliography, and glossary, with some information about 
the past and much about the present status of education, predominantly 
but not exclusively elementary education. The two latest editions of 
UNESCO's Basic Facts and Figures (1959a, 1960a) have been in English; 
some earlier editions were in French and Spanish. About a fourth of the 
tables in the latest volume are on education. 

UNESCO was responsible for two helpful directories, one giving brief 
information about educational associations in many countries (1959c) 
and the other giving full information about clearinghouses and docu-
mentation centers throughout the world (1957). 

Of bibliographies of international education, the most comprehensive 
is an annotated volume begun by Heath (1957, 1958) of the Division of 
International Educational Relations in the U.S. Department of Health, 
Education, and Welfare, Office of Education (1959). UNESCO published 
a partly annotated list of texts (1960b) on research methods and, for 
each of several countries, references to review articles summarizing 
educational research, bibliographies of research, and the names of journals 
which regularly contain reports on research. Eells’s long list (5700 items) 
of American theses on education outside the United States and on the edu- 
cation of foreigners within the United States (1959a) contains no anno- 
tations but helps identify elusive sources of information. The part of this 
list about scientific and mathematical education in foreign countries was 
issued separately (1959b). UNESCO published another specialized 
bibliography which lists books, journals, and articles on technical and 
vocational education in several regions and many countries (1959b). 


Literature Searches by Mechanical Means 


Most methods and machines used by industrial firms and government 
agencies to search scientific and technical literature are completely adapt- 
able to searching educational literature. One unfamiliar with the subject 
should prepare himself with a glossary; the latest and best is Wagner’s 
(1960). No single book, journal, or index, however, suffices as an intro- 
duction to this rapidly expanding field of technology. A survey of many 
systems now in use and experiments in progress was given by papers 
presented at two symposiums and edited by Shera, Kent, and Perry (1957) 
and Boaz (1959). 

Advances in methods for the mechanical searching of literature are 
described in Scientific Information Notes (1959); 58 systems now in 
operation were described more fully in two pamphlets in the series, Non- 
conventional Technical Information Systems in Current Use, compiled 
by Berry (1958a) and by Henderson and Ripple (1959b). Experimental 
progress is described in Current Research and Development in Scientific 
Documentation compiled by Berry (1958b), by Berry and Haksteen 
(1958), and by Henderson and Ripple (1959a, 1960). The latest issue 
contains notes, averaging a page in length, about more than a hundred 
projects. Some of these operational and experimental systems are described 
more fully in such periodicals as American Documentation (1950- ),
which also carries abstracts of articles appearing in other journals. 

Most systems for literature retrieval involve the use of statistical 
machinery or computers. Many educational research workers are likely 
to be interested in less elaborate and less expensive systems designed to 
index collections of 10,000 items or less. Marginal punch cards have been
in use for many years and are adequate for bibliographical control of small 
collections of articles or reports. One of the many descriptions of applica- 
tions of punch cards for indexing, Milne and Milne’s (1959) article 
revealed errors made in designing a system which used about 4500 cards. 

Soper (1955) described an unusual system of superimposed coding 
called co-ordinate indexing. It allows greater flexibility and precision 
than most systems and is adaptable to both manual and machine use. Its 
application to the indexing of a small collection of articles was described 
by Wilkinson (1959), and its capabilities were explored in detail by 
Taube and associates (1953, 1954, 1956, 1957, 1958). Zatocoding, a rela- 
tively simple system in which a machine sorts marginal punched cards, 
was described by Shera, Kent, and Perry (1957) and in detail by Mooers 
(1956). 
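
As an illustration only, superimposed coding of the kind used with marginal punched cards can be sketched in a few lines of Python. Each descriptor receives a fixed pseudo-random pattern of notch positions, the patterns for all descriptors of an item are overlaid in a single coding field, and a search selects every card whose field contains all the notches of the query. The field width, the number of notches per descriptor, the hashing scheme, and the sample collection are assumptions made for this sketch; none is taken from Soper, Taube, or Mooers.

```python
import hashlib

FIELD_WIDTH = 40     # assumed number of notch positions in the coding field
MARKS_PER_TERM = 4   # assumed number of notches assigned to each descriptor


def term_pattern(term: str) -> int:
    """Return a fixed pseudo-random set of notch positions as a bitmask."""
    mask = 0
    counter = 0
    # Keep drawing positions until the descriptor has the required notches.
    while bin(mask).count("1") < MARKS_PER_TERM:
        digest = hashlib.sha256(f"{term}:{counter}".encode()).digest()
        position = int.from_bytes(digest[:4], "big") % FIELD_WIDTH
        mask |= 1 << position
        counter += 1
    return mask


def card_code(descriptors) -> int:
    """Superimpose (bitwise OR) the patterns of all descriptors on one card."""
    mask = 0
    for term in descriptors:
        mask |= term_pattern(term)
    return mask


def search(cards, query_terms):
    """Select every card whose field contains all notches of the query."""
    query = card_code(query_terms)
    return [name for name, code in cards.items() if code & query == query]


# Hypothetical three-item collection, for illustration only.
collection = {
    "report-1": {"teaching machines", "programed instruction"},
    "report-2": {"sociometry", "classroom climate"},
    "report-3": {"teaching machines", "classroom climate"},
}
cards = {name: card_code(terms) for name, terms in collection.items()}
hits = search(cards, ["teaching machines", "classroom climate"])
print(hits)
```

Because the patterns of different descriptors may overlap, a search of this kind can select an occasional unwanted card, but it never misses a card that carries all the requested descriptors.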

Among descriptions of applications of computers and similar machines 
to literature searching, the series of articles by Faden (1959a, b, c, d) is 
helpful in summarizing information up to the programing stage. Machine 
translation, not yet of much concern in education, may become more 
consequential. Delavenay (1960) gave a unified and understandable
account of progress to the end of 1958. The most satisfactory index for
material in this field is the Applied Science & Technology Index (1960- ),
which covers material from 1959 to the present.


Summary 


During the last three years, access to the literature of education has 
been facilitated by a continuing stream of new bibliographies and guides, 
including new editions of standard works. Bibliographies appeared in 
the fields of adult education and health, physical education, and recrea- 
tion. A variety of statistical and bibliographical works on international 
aspects of education have rolled from the presses of UNESCO. And, if all 
these guides prove inadequate, the researcher can turn to one of many 
mechanical aids. 
CHAPTER VI


Research Tools: Observing and Recording Behavior 


JOHN WITHALL 


It appears that, at long last, researchers have taken to heart the dictum
credited to Kurt Lewin that there is nothing so practical as a good theory. 
Some research reported gives distinct signs of having been guided by im- 
plicit theories or models. The theories are by no means full-blown but 
seem to be in the process of development and formulation. The upshot 
is that some, though not all, of the research activity noticed here has been 
conducted within a framework that is sociopsychologically oriented and 
process oriented. 

What might be called, after Jahoda, the multiple-criterion approach 
seems to be implicit in some theories. Educational researchers appear 
to have arrived at the point with respect to instruction that Jahoda (1958) 
has arrived at with respect to the concept mental health, namely, that a 
multiple-criterion approach is needed to better understand, control, and 
predict variables in these global phenomena. The current thinking seems 
to be that these all-encompassing and relatively meaningless concepts 
have to be broken down into manageable, discrete, describable opera- 
tions or behaviors. This, in connection with instruction, necessitates the 
specifying, describing, and quantifying of the behaviors of teachers and 
learners under defined and described conditions. 

This chapter (a) offers a brief historical backdrop to current efforts 
at identifying, assessing, and quantifying teacher-learner interactions in 
the classroom; (b) identifies some of the major studies that have de- 
veloped methods and instruments for observing and recording classroom 
behaviors; and (c) indicates the directions in which present trends in 
research seem to point. 


Backdrop to Current Research 


Research on teachers and learners in the classroom has moved in 
stages. At the outset, with a nod toward learners as if they were all of 
the same mold, there was the listing of teacher traits by supervisors 
and administrators and sometimes pupils. Next came the stage of identifi- 
cation of words or phrases that teachers used which appeared to bear 
on the behavior and quality of learning. Another was that of studying 
child growth and development to better understand what went on in the 
educative process and to give some guidelines for classroom procedure. 
This tack tended to displace and cast into disrepute the tabula rasa con- 
cept in the learning relationship. There followed a sociometric-analysis 
stage, when individuals were seen as members of a social group in the 
learning situation. Finally, the findings of social psychology were utilized,
and researchers studied classes of pupils as groups and analyzed the 
interaction of the participants in this social milieu with particular atten- 
tion to the role and functions of both teacher and learners in the educa- 
tional process. 

Melby wrote in 1936: “One interesting outcome of the application of 
the various techniques for analyzing and describing classroom procedures 
seems to be the general disappointment with the conditions they reveal 
[which] tend to show an enormous lag between our theory and practice 
in education. . . .” This stricture seems to be as relevant today as it was 
25 years ago. We have just begun to develop some bench marks in our 
efforts to assess, measure, and make predictions about classroom learning 
and teaching, with the help of other disciplines, especially social psy- 
chology, anthropology, and sociology. These have afforded us some guides 
to identification and analysis of classroom variables that influence learn- 
ing. Some of these, which may be subsumed under the rubric of group 
influences, include the feelings and perceptions of learners and teachers; 
the roles which teachers and learners take in the classroom; the interpersonal
interactions of teachers with learners and learners with learners;
and the influence of the peer culture and of social class and differing 
cultures, along with the pervasive influence of values and value systems. 

Dewey’s theories of learning and his philosophy of education are supposed
to have helped educators conspicuously to conceptualize the learning
process as one in which the learner plays a major role, where problem-solving
techniques are a major vehicle of learning, and where self-guided,
functional, pragmatic learning eventuates. If Dewey’s operationalism is 
a guiding tenet in today’s schools—and Dewey is generally held respon- 
sible for the emasculation of our educational system, in terms of lowered 
academic standards, cavalier treatment of subject-matter mastery, and de- 
cline of the mental discipline—it would take some doing to demonstrate 
that this is so in the light of what is taking place today. For the ways 
in which teachers teach and the manner in which educational researchers 
have gone about their research indicate that until quite recently there
has been incomplete understanding of the pragmatism, operationalism, 
problem solving, and self-directed activity for which Dewey is held re- 
sponsible. 

It is astonishing and discouraging, as one examines the research of 
the past and the near-present, to discover how little attention, relatively, 
has been paid to the major variables in the teaching-learning process— 
the teacher, the learner, and teacher-learner interaction. Until recently, 
researchers have consistently concentrated on the influence of such matters 
as personality traits of teachers, spaced periods of learning as opposed 
to concentrated periods of learning, individualized versus mass instruc- 
tion, lecture versus discussion techniques in the classroom, and audio- 
visual aids and their contributions to the learning process. No one would 
have denied the fact that the teacher (as well as the learner) was an 
important factor, but it was assumed that the teacher’s own unique impact
and influence could be taken as a given factor, just as learners had been 
perceived as similar if not identical entities in the past. 

Social psychology gave increasing emphasis to leadership on the part 
of the teacher and to the impact of the group and its social-emotional 
climate on the roles and behaviors of teachers and learners and on the 
life pattern and achievement of each learner. These focuses of concern 
lately have been accompanied by interest on the part of educators and 
educational researchers in the influence of group experiences and activities
on learning in the classroom and also on intergroup education and
intercultural relations. The insights derived from group therapy were also 
related to educational procedures and problems. Sometimes they appeared 
in life-adjustment education, and most recently they emerged in the 
principle that the individual’s personal-social needs must be met before 
he can marshal his problem-solving skills and fulfill his responsibilities 
for content mastery. 


Historical Perspective 


As historical background for examination of current attempts to de- 
velop techniques and instruments for observing and recording teacher 
and learner behaviors in the classroom, the major studies since 1930 
are cited. 

One of the earliest attempts to devise a procedure for examining spe- 
cific teacher behaviors in the learning situation was Johnson’s (1935) 
study of the teacher’s verbal directions to children aged three to seven. 
She found that explicit directions or requests to individuals given in an 
unhurried, positive, and encouraging manner ensured much greater suc- 
cess in performance than vague, hurried, and discouraging verbal direc- 
tions. Olson and Wilkinson (1938) found teaching effectiveness related 
to the amount of verbal control in the classroom which could be de- 
scribed as positive and directing. The teachers who used blanket and 
generalized statements were less efficient teachers. The more able teachers 
used a larger proportion of guiding and approving statements. 

Anderson and his colleagues (1939, 1945, 1946a, b), postulating that 
the main direction of influence is from teacher to pupil, developed teacher- 
behavior categories to measure this influence objectively. Anderson identi- 
fied 26 categories with which he assessed teacher influence on pupils’ 
behaviors and differentiated teachers on the basis of the relative number 
of integrative (sympathetic, encouraging, friendly) and dominative 
(deprecating, authoritarian, unfriendly) contacts they had with children. 
Concomitant differences in children’s behavior were identified. It was 
demonstrated that children who were exposed to more integrative teacher 
behaviors showed lower frequencies of distracted and nonconforming 
behavior and significantly higher frequencies of spontaneous, co-operative,
and self-initiated behavior. Anderson, Brewer, and Reed (1946b) said:
“The ultimate objective of these several researches has been to produce 
measures of teachers’ classroom personalities that would have practical 
usefulness for research workers, for school administrators, and for teachers 
themselves.” 
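
The arithmetic behind differentiating teachers by the relative number of integrative and dominative contacts can be sketched as follows. This is a hypothetical tally, not Anderson's instrument: his 26 categories are not reproduced, each observed contact is assumed to have been labeled already as "integrative" or "dominative", and the summary index (the integrative share of all labeled contacts) is an assumption chosen for illustration.

```python
from collections import Counter


def contact_profile(contacts):
    """Summarize a record of teacher contacts labeled by kind.

    `contacts` is a sequence of the strings "integrative" and "dominative";
    any other label is ignored in this simplified sketch.
    """
    counts = Counter(contacts)
    integrative = counts["integrative"]
    dominative = counts["dominative"]
    total = integrative + dominative
    if total == 0:
        raise ValueError("no integrative or dominative contacts recorded")
    return {
        "integrative": integrative,
        "dominative": dominative,
        # Proportion of labeled contacts that were integrative.
        "integrative_share": integrative / total,
    }


# Hypothetical observation record for one teacher.
record = ["integrative", "dominative", "integrative", "integrative"]
profile = contact_profile(record)
print(profile)
```

A profile of this sort permits comparison among teachers only when the observation periods and labeling rules are the same for each.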

Lippitt (1940), in experimentally controlled group situations, assessed 
the impact of democratic, autocratic, and laissez-faire leadership styles 
on the cohesiveness of small groups of boys and on their interpersonal 
and problem-solving activities. In the light of Lippitt’s work and Ander- 
son’s premise that the main direction of influence in the classroom is from 
teacher to pupils, Withall (1948, 1949) developed a seven-category in- 
strument to assess classroom verbal behaviors of the teacher in terms of 
their learner-centeredness or teacher-centeredness—that is, whether the 
inferred intent of those verbalizations was to encourage and enhance the 
learning and achievement of the pupils or enhance and strengthen the 
goals and status of the teacher. 

Flanders (1951) followed this with an investigation of the influence 
of experimentally induced anxiety, as evidenced and measured by in- 
creased pulse rate, galvanic skin response, and a graphic record of intro- 
spectively perceived and reported positive or negative feelings of the 
learners. Perkins (1951) and Glidewell (1951) pursued different and 
related facets of the same problem. Thelen, who guided and helped 
facilitate these researchers in his laboratory, has pushed further in the 
exploration of the sociopsychological variables influencing individual 
learning and group achievement. 

Cornell, Lindvall, and Saupe (1953) developed an instrument to focus 
on eight dimensions in the classroom with particular reference to teacher- 
pupil interaction. The dimensions included social organization in the 
classroom, initiative of the pupils, competency of the teacher as indicated 
by differing performances in terms of selected behaviors, and classroom 
climate as reflected in the behavior of the pupils and as shown in the 
behavior of the teacher. Thirty-two classrooms were observed, and the 
instrument significantly discriminated among them. The authors indicated 
that revisions of the instrument seemed desirable and that an inventory 
administered to students on their perceptions of the classroom situation 
may be as valid as the observation device. 

Hedlund (1953) developed an instrument to identify critical incidents 
or behaviors which enable principals, educational experts, and pupils 
to distinguish effective and ineffective teachers. Some 4600 descriptions 
of effective and ineffective behaviors by teachers were reported. They 
fell into 68 specific behavior categories. After considerable sifting, win- 
nowing, and testing, 43 behavior items emerged with strong predictive 
value, some useful for both sexes and some for one sex only. Ultimately, 
a predictive index comprising 18 of the best predictive items for each 
sex was worked out. Hedlund observed that, although his findings were 
encouraging, they needed to be cross-validated. 
Conceptual Frameworks


In accordance with the emergent sociopsychological, process-oriented, 
and multiple-criterion approach to the study and assessment of the teach- 
ing-learning process, Jensen (1955, 1960) identified the need for a frame- 
work of concepts that could be used to analyze classroom behaviors 
systematically. Taking the needs approach and examining learners both 
as individuals and as group members, he specified seven dimensions: 
problem solving, authority-leadership, power, friendship, personal pres- 
tige, sex, and privilege. He contended that class productivity, individual 
achievement, and member satisfaction with the class arose from the diverse 
relationships of the seven different dimensions, and he offered a promis- 
ing framework in which to assess the influence of social interactions in 
the classroom and the resultant influences on group performance and in- 
dividual learning. He underlined, as Jennings (1947) had done, the 
importance of there being a network of relationships in the learning and 
problem-solving group that ensure that both individual needs and group 
needs are satisfied so that the objective problem solving, achievement- 
learning, and related work tasks are dealt with effectively. Getzels and 
Thelen (1960) summed up prevailing opinion on a process, sociopsycho- 
logical approach to classroom learning, and described nomothetic, idio- 
graphic, and transactional styles of teaching. They identified the trans- 
actional style as striking a balance between promotion of students’ 
achieving personal goals and requirement of their mastering subject 
matter in the classroom situation. In their view, the most viable instruc- 
tional setting is one in which both the individual’s self-development goals 
and content-mastery goals are attained. They emphasize, moreover, that 
the individual’s self-needs must be met before progress can be made in 
content mastery. 

If an educational theory (that is, a systematic portrayal within an
integrated framework of a number of variables to demonstrate the
relationships among these variables, as well as their relationships with other
factors such as knowledge, attitude, and skill outcomes) is to be
formulated, then the kind of concept identification and integration which Jensen,
Thelen, and Getzels have set forth in their recent writings seems to be the
sine qua non of such theory development.


Research Influenced by a Sociopsychological Rationale 


Considerable research has been done to assess the variables of the 
teaching-learning process from the vantage point of the developing socio- 
psychological framework. In such studies cognizance is taken of the 
interaction of personal-social and achievement-learning needs in the con- 
text of classroom instructional process. Flanders (1959, 1960a, b) focused 
on aspects of the transactional type of teaching, that is, on teacher
behaviors which control and delimit the students’ freedom of action as
contrasted with those which invite and encourage activity and spontaneous
participation by the learners. He (1960b) developed 12 categories for 
analysis of the influence pattern of the teacher, basing them on constructs 
and categories of Anderson (1939), Withall (1949), and Bales (1950). 
Flanders’s 12-category system assesses direct and indirect influence of the 
teacher’s talk, as well as the students’ talk, according to whether it is 
responding to or initiating behavior. It identifies, along with other things, 
teachers who are parsimonious with praise and encouragement and show 
little interest in the affect side of the students’ learning activities, and thus 
the system has considerable significance for the current research in the 
mental-health impact of teachers’ classroom behaviors. Flanders’s inter- 
action analysis technique affords: (a) operational definition of teacher 
behaviors and (b) a way of quantifying the behaviors that contribute 
to the dynamic interaction of the participants in the teaching-learning 
process. 
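
The quantification step described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the review does not list Flanders's 12 categories, so the three labels used here ("indirect", "direct", "student") are hypothetical groupings, and the function name and the two summary ratios are likewise invented for the example.

```python
from collections import Counter


def influence_summary(codes):
    """Summarize a sequence of coded observations.

    `codes` is a list of labels, one per observed unit of classroom talk.
    The labels "indirect", "direct", and "student" are hypothetical
    groupings, not Flanders's actual category names.
    """
    counts = Counter(codes)
    teacher_total = counts["indirect"] + counts["direct"]
    total = sum(counts.values())
    return {
        # Share of all coded units that are teacher talk.
        "teacher_share": teacher_total / total if total else 0.0,
        # Share of teacher talk that was coded as indirect influence.
        "indirect_share_of_teacher": (
            counts["indirect"] / teacher_total if teacher_total else 0.0
        ),
    }


# Illustrative record of eight coded units (made-up data).
record = ["direct", "direct", "indirect", "student",
          "indirect", "student", "direct", "indirect"]
summary = influence_summary(record)
```

A teacher who is sparing with praise and encouragement would, under this sketch, show a low `indirect_share_of_teacher`; the thresholds for "low" are not given in the text and would have to be set by the investigator.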

Damrin (1959) developed an instrument, under the name of the 
Russell Sage Social Relations Test, to measure the competence and skill 
of elementary-school youngsters in group planning and group work. 
The test, which involves three construction problems with miniature
blocks, comprises a planning stage in which the group decides how to 
construct the figure and an operations stage in which the figure is 
constructed. No limit is set on the time to be used in planning, but 15 
minutes is allotted to the operations stage. During both the planning and 
the operations stages an observer keeps a record of behavior on standard 
observation sheets. Some concomitant indicators of the reliability and 
validity of the instrument have been set forth, and work is under way 
to vigorously test the technique statistically. Seven types of groups 
emerged from the planning stage: (a) mature, (b) dependent, (c) im- 
mature, (d) semicontrolled, (e) semirestrained, (f) uncontrolled, and 
(g) restrained. Nine types of groups emerged from the operations stage: 
(a) mature, (b) immature, (c) disinterested, (d) rollicking, (e) excited, 
(f) rowdy, (g) suppressed, (h) bickering, and (i) quarreling. 

The instrument looks promising in that it draws on sociopsychological 
concepts and focuses on specific, observable behaviors in terms of socio- 
psychological context. Drawbacks at this stage are lack of evidence re- 
garding its validity and reliability and the necessity for two highly trained 
and skilled individuals to administer the test and record the subjects’ 
behaviors. The former drawback is being dealt with, and the latter arises 
with the use of any worthwhile testing, observation, and recording process 
so far devised. The promise of the instrument lies partly in the fact that 
it deals with the measurement of social and psychological forces and 
gives evidence of being applicable to other age levels, including adults. 

Gold (1958), working within the framework of an interactional theory 
of leadership, examined the learning situation to determine what variables 
influence the status and roles of children in the classroom. He used the 
concepts of power, properties (attributes or qualities), and resources
(abilities valued by the peer group) possessed by children to analyze the 
social relationships of those youngsters in classroom groups from kinder- 
garten through grade 6. Seventeen characteristics of children deemed 
important by their peers were identified. These included such characteris- 
tics as “smart at school,” “acts friendly,” “knows how to act so people 
will like him,” and “does things for you.” Pupils were asked to assign 
these traits to others in their class. The results of the study indicated 
“. . . a relationship between the values of the children in our study, 
the properties perceived to be possessed by the children and the power 
structure of the classroom group.”

This study placed considerable emphasis on the peer-group influences
in the learning situation which, it cannot be denied, exerted considerable 
influence on the child’s status in the classroom. It also seems clear that 
the peer-group values of children influence their openness to learning 
content, attitudes, and skills. In many respects the pupils with power serve 
as gatekeepers to the learning of the knowledge the teacher is trying to 
impart. This emphasis on peer-group influence in the learning situation 
underlies its role as an instructional vehicle. 

Zander and Van Egmond (1958) examined the relationship of intelli- 
gence and social power (ability to get others to do things) of 418 
children in grade 2 and grade 5 classrooms. They hypothesized that the 
cultural expectations for boys to be self-reliant and to strive for achieve- 
ment were easily realized if the boy possessed either social power or 
intelligence and that society’s expectations for girls to be obedient, 
nurturant, and responsible required neither social power nor intelligence.
Data included Kuhlmann-Anderson Test scores, peers’ ratings of four
characteristics, observed behavior in small work groups, and teachers’ 
ratings of seven social behaviors. The findings were that: (a) social power 
is not highly correlated with intelligence; (b) those with greater social 
power were better liked regardless of sex; (c) boys won social power by 
being threatening, and girls by doing well the things required of them; 
(d) boys low in social power and intelligence were like girls in their 
quiet, unassertive patterns of behavior. 

Shapiro, Biber, and Minuchin (1957) described an instrument, the 
Cartoon Situations Test, developed for the purpose of predicting teaching 
success. The dimensions assessed by the instrument include the prospec- 
tive teachers’ quality of expressive tone, orientation to dilemmas, quality 
of emotional identification with characters in the cartoon, perception of 
the authority role, quality of psychological thinking, orientation to action, 
mode of aggressive expression, and attitude toward socialization. The kind 
of affect projected into the cartoon situation seemed crucial, as did the 
absence of expression of hostility. The findings indicated that the instru- 
ment has predictive value; its use is being further explored. 

Haigh and Schmidt (1956) examined the relative effectiveness of 
teacher-centered and group-centered classes. Students were placed in 
teacher-centered and group-centered classes according to their stated 
preferences. The group-centered class was not required to take a final 
examination. The Horrocks-Troyer Test was given to all subjects at the 
end of the experiment, which ran one academic year. There appeared 
to be no significant differences between the two groups in subject-matter 
learning. 

Maier and Maier (1957) compared the effects of two leadership- 
discussion techniques on group decision. One technique afforded free 
discussion in a permissive manner; the other entailed the leader’s break- 
ing a problem into parts and keeping all group members together in 
considering it. A significant difference was obtained in the quality of 
decisions of the two groups: twice as many of the developmental discus- 
sion group members as the free discussion group members reached a 
high-quality decision. The generalizability of these results is limited; 
the authors believe their findings applicable only to problem solving in 
which there is little or no emotional involvement. 

Calvin, Hoffman, and Harden (1957) tested the hypothesis that a
permissive social climate enhances the achievement of high-IQ subjects and
handicaps subjects with only average intelligence scores. Their
conclusion, reached on the basis of a trend occurring in all their experiments
and not on the basis of an acceptable level of statistical significance, was
that findings supported the hypothesis.

These studies on the effect of teacher-centered versus learner-centered
group atmospheres and on permissive as opposed to structured teaching 
methodologies throw some doubt on the assumption that permissive and 
learner-centered instruction inevitably leads to better learning and 
achievement. 

Getzels and Guba (1955) used a 71-item instrument that dealt with the 
socioeconomic, civic, and professional roles of the teacher in terms of their 
situational and personal aspects. They found that the teachers felt troubled
by the role conflicts they experienced.

Trow (1960a, b) assessed the several functions and roles of the teacher 
and related them to the teacher’s skill in effectively implementing the roles. 
He emphasized the inescapable teacher-learner relationship of controller to 
controlled, superior to subordinate, and the central roles of the teacher in 
this context as therapist, strategist, and instructor. Trow and his colleagues 
have consistently emphasized the sociopsychological framework for the 
study of the learning process in the classroom, and have pioneered some 
of the work and findings emerging from this framework. 


A Multiple-Criterion Approach— 
Operational Definitions of Classroom Interaction 


Hughes and associates (1959), proceeding from the assumption that 
the teacher cannot speak or act in the teacher-learner situation without
performing a function for someone in the situation, developed a code for 
the analysis of teaching. The subjects were 21 teachers judged “good” by
administrators and supervisors from several schools and 10 teachers 
representative of one school. Focusing on the behavior of the teacher as 
reported by two trained observers in narrative form, Hughes identified 
31 functions which the teacher fulfilled vis-a-vis specific individuals in the 
classroom. She examined the problem of each teacher-act’s having a multi- 
pronged effect on pupils in a classroom. She confined herself, however, 
to interpreting the act as performing a function only for the particular 
individual or individuals to whom the act was directly addressed. 

The 31 functions were subsumed under seven categories: (a) control- 
ling, (b) imposition of teacher, (c) facilitating, (d) developing content, 
(e) response, (f) positive affectivity, and (g) negative affectivity. Con- 
clusions reached were that there are few “good” and few “bad” teachers, 
that criteria used by administrators for judging “good” teachers are com- 
pounded of many elements and are not comparable, and that the relation- 
ship of the teacher to child reflects to a marked degree the adult-child 
relationship of our culture. Some conclusions were actually opinions and 
went beyond the data presented. 

Wright (1959), studying verbal behaviors of secondary-school mathe- 
matics teachers, used three frames of reference for assessment: ability to 
think, appreciation of mathematics, and attitude in terms of curiosity and 
initiative. If the verbalization did not fall into at least two of the frames, 
it was categorized as neutral. The instrument appears to have several 
limitations: (a) the observer must be trained in the specific subject 
matter; (b) he must be trained only by the researcher who developed 
the instrument; (c) he is required to interpret and infer as he categorizes; 
(d) large amounts of time must be devoted to the observation of each 
classroom. 

Medley and Mitzel (1958) developed the Observation Schedule and 
Record (OScAR) by modifying the classroom observational procedures 
of Withall (1948) and Cornell, Lindvall, and Saupe (1953). The observer 
records both teacher and learner behaviors under an Activity Section 
which identifies 44 possible activities of teacher and pupils. He next 
employs the Grouping Section of the instrument to identify and list large 
and small groups and to note acts of individual pupils. He then notes in 
the Materials Section the type of instructional materials used. Finally, he 
enters in the Signs Section items symptomatic of classroom climate. Dif- 
ferences between classes can be identified, it is maintained, with fewer 
than 14 variables. A study of the factorial structure of the 14 scoring keys 
indicated that the OScAR technique gives reliable information about three 
relatively discrete dimensions of classroom behavior—the social-emotional 
climate, the relative emphasis on verbal learnings, and the degree to which 
the social structure centers about the teacher. 

In a subsequent study Medley and Mitzel (1959) tried to identify the 
relationship between some measures of teacher effectiveness and some 
teacher-behavior variables. They compared pupils’ reading growth,
problem-solving skills, pupil-teacher rapport, teachers’ self-ratings, and
principals’ ratings with teacher behaviors associated with emotional climate,
emphasis on verbal activities, and social organization. Their findings 
indicated that supervisors’ ratings for evaluating learning are inadequate. 
They questioned the relevance of a considerable body of research that has 
used ratings of some kind as a criterion of teacher effectiveness. They 
also found that gains in reading and gains in group problem-solving 
skills seem unrelated to recorded classroom behaviors of teachers and 
pupils. 

More studies of this sort attempting to relate learner achievements to 
identified behaviors in the classroom may dissuade researchers from em- 
ploying designs which involve ratings to assess variables. Despite the fact 
that a number of studies, such as those of Brookover (1940), Jayne (1945), 
Lins (1946), and Anderson (1954), have indicated the questionable 
validity of ratings and checklists, such ratings and lists are still used as 
criteria, as, for example, by Willard (1957) and Davidson and Lang 
(1960). 

Kowatrakul (1959) developed an instrument comprising six categories 
for studying student behaviors in the classroom. The categories are: (a) 
intent on ongoing work, (b) social-work oriented, (c) social-friendly, 
(d) momentary withdrawal, (e) intent on other academic work, and (f) 
intent on work in nonacademic area. These were used while the students 
were doing independent seatwork, watching or listening, or participating 
in a discussion. It was possible to identify and examine relationships 
between classroom activities, subject matter, and students’ behaviors. The 
study was a modest attempt to specify, define, and quantify discrete 
behaviors in the classroom under certain stated conditions. By such little 
steps a formulation of a theory of education may eventually be reached 
that will help predict and control the variables of the educative process. 

Kounin and Gump (1958) studied the behavior of kindergarten children 
as the latter watched their teacher disciplining or scolding a child for 
misbehavior. The researchers collected and analyzed 406 incidents and 
categorized the teachers’ acts in terms of clarity, firmness, or roughness. 
A child’s behavior as he watched the reprimand was listed as: (a) no 
reaction, (b) behavior disruption, (c) conformance, (d) nonconformance, 
or (e) both conformance and nonconformance. Within the framework of 
disciplinary procedures, it seemed that the ripple effect (impact of the 
teacher’s disciplinary action on the watching child) is best controlled 
by clear instructions to the child being reprimanded. On the one hand, 
firmness or lack of it did not allow reliable prediction of how the watching 
child would react; on the other hand, roughness usually resulted in
behavior disruption in the watching child. It is interesting to note that a
phenomenon of which all have been aware (the effect on children of 
witnessing the public disciplining of a peer) has not been more closely 
examined. This and similar projects point to development of a sound 
theory of education. 

Cogan (1956, 1958a, b), using perceptions and judgments of pupils as
a criterion of effective teaching, deliberately rejected as criteria both 
pupil change and the more commonly (and easily) used evidence from in- 
service ratings and experts’ opinions of the teacher’s competence. He 
categorized teacher behaviors as preclusive, inclusive, and conjunctive, 
and assessed their effects on the learners. These effects were measured in 
terms of pupil performance of required and self-initiated work having 
to do with classroom activities. Cogan examined the logic and desirability 
of using pupil change as the major criterion of teacher effectiveness, but 
rejected it in favor of reports by the pupil of having carried out required 
schoolwork and self-initiated work arising from classroom experiences. 
This looks like a direct and common-sense way of assessing teacher ef- 
fectiveness and of capitalizing on pupil judgments and perceptions which, 
to judge by earlier research, appear to have more reliability and validity 
than administrators’ or supervisors’ ratings. 

Levin, Hilton, and Leiderman (1957) offered a survey of the main 
studies of the Harvard Teacher Education Research Project. They in- 
cluded a précis of Cogan’s study and digests of the other studies, including 
an examination of authoritarianism in teaching, ego involvement in teach- 
ing, interests of teachers, bases for withdrawal from teaching, differences 
between elementary and secondary student teachers, and prediction of 
classroom behaviors of student teachers. They concluded with the under- 
statement “. . . we have discovered that prediction of teacher behavior is 
a complex task with many questions which demand further investigation.” 

Rabinowitz and Rosenbaum (1958) assessed the predictive value of 
pupil-teacher rapport by certain standardized and some experimental 
instruments. The battery of instruments included the Minnesota Teacher 
Attitude Inventory, the California F Scale, the Draw-a-Teacher Technique,
and the Strong Vocational Interest Blank. Seventeen measures comprising 
nine test scores, seven classroom observations, and one measure of pupil- 
teacher rapport were used. The researchers reported that the tests, singly 
or in combination with one another, failed to predict subsequent
pupil-teacher rapport, and they concluded that the tests did not correlate with the
objective measure of behavior in the classroom.

All the problems of research in this area are summed up and recorded 
by Ryans (1960) in an impressive, statistically comprehensive, and 
sophisticated manner. One of his findings confirmed the belief that 
teacher behavior in the classroom can be represented by three dichotomies 
which might be designated as friendly versus aloof, systematic versus un- 
organized, and imaginative versus uninspired. Results of the study derive, 
in the main, from use of two instruments, the Classroom Observation 
Record and the Teacher Characteristics Schedule. The former consists of 
22 bipolarities, for example, apathetic versus alert, uncertain versus 
confident, partial versus fair. Eighteen of these opposites were used to 
rate teacher behaviors, and four were used to rate pupil behaviors. The 
22 bipolarities were scaled on a seven-point scale in which the fourth 
represented an average or neutral score. Observers noted specific
behaviors by teachers and pupils and estimated the extent to which one or
the other pole was approximated by the behavior of the teacher. 
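
As a rough illustration of how ratings on such a seven-point bipolar scale might be summarized, the sketch below averages several observers' ratings for each bipolarity and reports the distance from the neutral fourth point. The function name, the data layout, and the convention that 1 marks the first-named pole and 7 the second are assumptions made for the example; the review does not describe Ryans's scoring procedure in this detail.

```python
from statistics import mean

NEUTRAL = 4  # the fourth point is described in the text as average or neutral


def summarize(ratings_by_dimension):
    """Average observer ratings per bipolar dimension.

    `ratings_by_dimension` maps a dimension label (e.g. "apathetic-alert")
    to a list of integer ratings from 1 to 7. Which pole is scored 1 and
    which 7 is an assumption of this sketch.
    Returns {label: (mean_rating, mean_rating - NEUTRAL)}.
    """
    summary = {}
    for label, ratings in ratings_by_dimension.items():
        if not ratings or any(r < 1 or r > 7 for r in ratings):
            raise ValueError(f"ratings for {label!r} must be non-empty and in 1..7")
        average = mean(ratings)
        summary[label] = (average, average - NEUTRAL)
    return summary


# Made-up ratings from three observers on two of the bipolarities
# named in the text.
example = summarize({
    "apathetic-alert": [5, 6, 7],
    "partial-fair": [4, 4, 4],
})
```

A positive difference indicates that the observers placed the behavior nearer the second-named pole, a negative one nearer the first.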

The other instrument, the Teacher Characteristics Schedule, comprises 
300 multiple-choice and checklist items of teacher attitudes and view- 
points which seemed to correlate with teacher classroom behaviors as 
rated by the observers using the Classroom Observation Record. Ryans’s 
summary points up the immensity of the task involved in devising ways 
of assessing and predicting teacher behaviors. Despite the more than 
10 years of painstaking, thorough, and concentrated effort that this work 
represents, one is appalled at the fact that we have merely begun to nibble 
at the problem. This is made clear by the fact that the not inconsiderable 
findings of all the years of work by Ryans and his colleagues can be 
summarized on two pages (360 and 361) of his book. It is interesting to 
note that the lack of clear knowledge of the patterns of behavior of 
teachers cited by Ryans is being gradually eliminated by work such as that
of Medley and Mitzel (1959) and Flanders (1960b). 

At the University of Wisconsin in November 1960 four research 
projects on mental health in teacher education, supported by the National 
Institute of Mental Health, reported their efforts to describe and measure 
behavior patterns of both university instructors and public-school teachers 
in the classroom. Two working papers of the Wisconsin project, by Newell, 
Lewis, and Withall (1960) and Lewis, Withall, and Newell (1960) include 
statements on a 14-category instrument designed to describe teachers’
behavior in terms of their asking for or giving information and directions 
to the learners and in terms of the negative and positive affects that 
accompany these interactions. Interjudge reliabilities (rank-order corre- 
lation coefficients) between two highly trained observers in three class- 
rooms were 0.99, 0.97, and 0.98. 
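
For readers unfamiliar with rank-order correlation as an index of interjudge agreement, a minimal sketch follows. It computes Spearman's coefficient as the product-moment correlation of the two observers' ranks. The category tallies are invented for illustration and are not the Wisconsin data; the working papers may have used a different computing formula.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def spearman_rho(a, b):
    """Rank-order correlation: Pearson correlation applied to the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical tallies in each of 14 categories for two observers
# watching the same classroom (illustrative numbers only).
observer_1 = [12, 30, 7, 3, 18, 25, 1, 9, 15, 4, 2, 6, 20, 11]
observer_2 = [11, 33, 8, 2, 17, 26, 1, 10, 14, 5, 3, 6, 21, 12]
rho = spearman_rho(observer_1, observer_2)
```

A coefficient near 1.0 means that the two observers ordered the categories by frequency in nearly the same way; it does not by itself show that they agreed on the coding of each individual act.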

The University of Texas program concerned with mental health in 
teacher education developed recording operations in the instructional 
situation to describe classroom behaviors of the teacher and verbal and 
nonverbal behaviors of the students (Harris, 1960). Observers record 
in shorthand style all observable behaviors of students and teachers. 
Students’ and instructors’ oral responses are recorded verbatim. The 
record is transcribed to a running account as soon as possible after 
observation. Techniques for analyzing the typed transcripts, still in process
of development, include content analysis, categorization of specific units
of content, interaction process analysis, and adaptation of case-study 
techniques. 

The Bank Street College of Education project staff (1960) outlined 
the beginnings of a classroom observation procedure of their multi- 
faceted study of the relationship of school experiences and personality 
development. Two recorders observe intensively for one hour and a half, 
and their records are combined to give the teacher’s presentation and 
management techniques and the child’s responses. The procedure aims to 
reveal the two levels of social-psychological functions that appear in the
classroom, the overt and the covert. This entails not only the planned, 
recognized, or relatively formal patterns of interaction in the classroom, 
but also the teacher’s manner of indirectly structuring the children’s 
orientation toward her, each other, and their work, by analyzing her 
differential allocation of rewards and punishments, goal-setting statements 
of various kinds, and evaluative comments. 

The instrument-developing efforts of the Wisconsin, Bank Street, and 
Texas studies may help to extend our knowledge of teachers’ patterns of 
behavior. 

Bowers and Soar (1960) described an extended study of the human- 
relations skills needed by educators and the procedures that could be 
used in a three-week workshop to help teachers develop these skills. Work- 
ing with 60 elementary-school teachers divided into control and experi- 
mental groups, they collected personality and attitude data, biographical 
data, and classroom observations by means of the OScAR. The teachers 
kept a log of their own activities. The question was raised of what impact 
the intensive workshop experience should have on teacher or pupil class- 
room behavior, and pertinent data were collected. In addition, analyses 
were made of the relationship of teacher behavior to teacher self-descrip- 
tions, the characteristics of teachers who use group activities, the correlates 
of effective group membership, and the relations between measures of 
teacher effectiveness. The significance of much of this research is its 
assessment of the effectiveness of laboratory training in human relations 
in changing the behavior of human beings, in this instance, teachers and 
pupils in the classroom situation. 

Rippy (1960) reported a study of the relationships in the classroom 
between social-emotional climate, verbal emphasis, and social structure on 
the one hand, and, on the other, pupil skill in group planning and teacher 
attitudes and personality. He used Damrin’s (1959) Russell Sage Social 
Relations Test and the OScAR of Medley and Mitzel (1958) for assessing 
human relations in the classroom. To assess attitudes and personality 
characteristics of the teachers, the Bowers Teacher Opinion Inventory, 
the Minnesota Teacher Attitude Inventory, the Minnesota Multiphasic 
Personality Inventory, and the Survey of Educational Leadership Prac- 
tices were used. Fifty-four elementary-school teachers comprised the total 
population. Rippy’s conclusions were that observing specific behaviors in 
the classroom afforded criteria of teacher effectiveness, that the way 
teachers described themselves was reflected both in the teachers’ actual 
classroom behavior and in that of the pupils, and that teacher effectiveness 
is a multidimensional phenomenon. 

Repeatedly in the literature of the last 30 years, brave words are en- 
countered about the disappearance of the cleavage between cognitive and 
affective processes, the significance of personal-social needs and perceptions 
in the learning process, and trends toward reformulation of the problem 
of learning in a social-emotional context. Until recently these have 
largely represented wishful thinking. Now there seems to be a modest
ground swell of research activity within the context of the sociopsycho- 
logical orientation along with a modest effort to identify the behavioral 
correlates of certain instructional procedures and resultants. 


Conclusion 


Two major trends influence researchers engaged in observation of class- 
room activities. One is reflected in the studies guided by the sociopsycho- 
logical orientation set forth by Jensen (1960), Getzels and Thelen (1960), 
Gibb (1960), and Jenkins (1960). The other is seen in the attempt to 
define operationally the specific behaviors in which teachers and learners 
engage that can be hypothesized to relate significantly to group behaviors
and individual learning. If these two trends merge, major advances in 
the control and prediction of learner and teacher activities are possible, 
as well as in the development of educational theory and ultimately the
redirection of the teacher-education process. 


CHAPTER VII


Research Tools: 
Instrumentation in Educational Research 


EDWARD B. FRY 


Educational research has long been dependent on the rating scale and
the paper-and-pencil test. Current writings reveal that these tools are used 
over and over again, with a few exceptions in studies related to experi- 
mental psychology. Recently, however, attention has been drawn to new 
devices. Many of the new devices fall into a category commonly known as 
teaching machines. 


Automated Learning Graph 


Through the use of simple mechanical instrumentation, Keislar (1959) 
saw a learning curve develop. The instrument he used was a multiple- 
choice teaching machine, a converted “Navy Rater,” which presented 
“pages” containing information about rectangles, followed by a multiple- 
choice question. The pupil responded to the question by pushing one of 
several buttons. If he pushed the correct button, a new “page” was pre- 
sented; if he pushed a wrong button, nothing happened. An automatic 
recording device drew a graph. Perfect learning resulted in a vertical line, 
and errors made the pen move horizontally. Keislar’s finding was that 
14 fourth-grade and fifth-grade pupils using the instrument learned the 
material significantly better than the control group; however, the fact 
that the instrument can show exactly how the students learned at each 
step of the lesson and can graph the learning process automatically and 
instantaneously is more exciting. 
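
The graphing rule described above (a correct response moves the pen vertically, an error moves it horizontally) can be sketched as follows. The unit step sizes, the function name, and the sample response record are assumptions for illustration; the text does not give the scale of Keislar's recorder.

```python
def learning_trace(responses):
    """Return the pen positions (x, y) after each response.

    `responses` is a sequence of booleans: True for a correct button
    press, False for an error. The pen starts at the origin and moves
    one unit up per correct response and one unit right per error
    (unit steps are an assumption of this sketch).
    """
    x, y = 0, 0
    points = [(x, y)]
    for correct in responses:
        if correct:
            y += 1  # correct response: vertical movement
        else:
            x += 1  # error: horizontal movement
        points.append((x, y))
    return points


# Made-up record: two errors early in the lesson, then steady success.
trace = learning_trace([False, True, False, True, True, True])
```

Under this rule a pupil who never errs produces a vertical line, and a run of horizontal segments marks the point in the lesson at which the presentation gave trouble.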

With instrumentation of this type and other types described by Keislar, 
large amounts of data can be easily collected which will show learning 
plateaus, fatigue, weaknesses in presentation, effects of supplemental 
stimuli, and other variables. 


Removal of Teacher Variable 


One of the weaknesses in educational research has been the teacher 
variable—different teachers supplying enthusiasm or some other contami- 
nant which makes the experiment difficult if not impossible to replicate. 
Instrumentation, to a large extent, can eliminate the teacher variable. 
For presenting lessons, instrumentation need not be complex. A tape 
recorder or motion picture can act as a standard instruction stimulus. 
A recent article on auditory abilities by a well-known reading specialist 
described the reading by teachers of paired words from standardized 
tests. The amount of confusion which could enter with regional dialects
is almost unbelievable; yet, as simple a thing as a phonograph record 
could give a standard stimulus. Luser, Stanton, and Doyle (1958) used 
recordings of 43 drill sessions in phonics to aid experimental groups. 
Evans, Glaser, and Homme (1960) developed a standard lesson to offer 
a control of that variable while other factors were varied. 


Traditional Instrumentation 


Educational researchers have been reluctant to adopt the established 
instrumentation of experimental psychology, but, with demand for more 
rigor, they will need and use more mechanical aids. As Grings (1954) 
states, “. . . specialized instrumentation . . . makes possible not only the 
extension of the range of senses but a reduction in the ‘personal equation’ 
of observation.” He classifies instrumentation as (a) behavior recording 
systems (polygraph), (b) timing and counting (clock, electronic counter), 
(c) audition (audio oscillator), (d) vision (light meter, color plate), 
(e) other senses (anasthesiometer), (f) human learning and perception 
(memory drum, stereoscope), and (g) bioelectricity (electroencephalo- 
gram and galvanic skin response). 

One ambitious doctoral candidate wired a teacher to a portable galvanic 
skin-response device and recorded her emotional reactions to the classroom 
(Goody, 1951). 

An excellent and interesting review of devices and paraphernalia used 
in problem-solving research has been done by Ray (1955). He described 
multiple-choice apparatus, electromazes, water jar problems, coin weighing, 
the two-string problem, the cut pyramid, and problem boxes. 


Discrimination 


Discrimination training is common in the psychology laboratory but 
little used with direct relationship to education. Hively (1960) developed 
a teaching machine for simple discrimination which presented a stimulus 
picture and two choice pictures beneath glass plates; the child indicated 
his choice by touching a plate. Testing reading readiness of children from 
three to five and a half years old, he found that 15 of 27 subjects could 
learn simple discrimination; then, when the stimulus was gradually 
altered until matching was required, 4 out of 13 learned the matching task. 

Hively’s experiment was not a successful use of a teaching machine, 
but it points toward use of instrumentation in educational research. Use 
of the machine introduced a new contaminant, which the author described 
as “behavior which was shaped and maintained by accidental operation 
of the apparatus.” Equipment manufacturers quickly saw a relationship 
between child-training and rat-training devices, and one company offers 
a mechanical dispenser for M & M’s candies instead of food pellets. 


Skinner’s Disk Machine 


Skinner (1958), a leader in the field of instrumentation for teaching 
and research, sought to increase the rate of learning. His disk machine 
presents information to be learned and asks for a response, usually in 
written form. The student writes a word or phrase on a tape that appears 
in a window. Then he activates a lever that brings the correct answer into 
view and, at the same time, moves his response under a glass portion of 
the window so that it cannot be changed. By moving the lever, the student 
indicates whether or not he judges his response correct; if it is incorrect, 
the item is presented again at the completion of the lesson. 
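
The re-presentation rule just described can be sketched in a few lines of Python. This is only an illustration of the logic, not a description of the machine's mechanism; the function name `run_lesson` and the `judge` callback (standing in for the student's self-scoring with the lever) are hypothetical.

```python
from collections import deque

def run_lesson(frames, judge):
    """Show each frame once, then show again any frame judged incorrect.

    frames: list of frame identifiers (hypothetical).
    judge:  callable(frame, attempt) -> bool, standing in for the
            student's own judgment of his response.
    Returns the order in which frames were shown.
    """
    queue = deque((frame, 1) for frame in frames)
    shown = []
    while queue:
        frame, attempt = queue.popleft()
        shown.append(frame)
        if not judge(frame, attempt):
            # A missed item goes to the back of the line, so it
            # reappears only after the remaining items have been seen.
            queue.append((frame, attempt + 1))
    return shown

# Example: the student misses frame "B" on his first attempt only.
order = run_lesson(["A", "B", "C"], lambda f, n: not (f == "B" and n == 1))
print(order)  # ['A', 'B', 'C', 'B']
```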

Holland (1959) described an experiment in which 187 college sopho- 
mores used disk machines for 10 weeks in studying psychology. They 
worked through 1400 different frames in a median time of 14 hours. 
Though the experiment lacked rigorous control, 76 percent of the students 
said they felt the machine helped them in studying. Holland’s experiment 
demonstrates that machines can be used in a teaching situation. Further- 
more, an extremely important process, that of item analysis of student 
responses, was used. Heretofore, instructional materials (lectures, text- 
books) have been developed almost solely by the armchair method. 
Machines are showing that it is possible to examine rigorously the pres- 
entation of curriculum material and find the exact point at which the 
student ceases to understand. 

Some authors of teaching-machine programs have reported that the 
necessity of breaking the subject matter into the small units requiring 
responses revealed numerous gaps in established modes of presentation. 
An item analysis of machine responses positively shows these gaps. One 
model developed according to Skinner’s principles by Rheem Caliphone 
Corporation includes a device which automatically tallies incorrect re- 
sponses on the back of the tape. Thus the educational researcher simply 
needs to look at the back of the curriculum material to see where the 
errors occurred. 
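
An item analysis of this kind amounts to a tally of incorrect responses by frame. The following Python sketch assumes a made-up response log of (student, frame number, correct) records; it is not the format produced by any of the machines described.

```python
from collections import Counter

def error_tally(responses):
    """Count incorrect responses per frame.

    responses: iterable of (student, frame_number, correct) tuples
               (a hypothetical log format).
    """
    return Counter(frame for _, frame, correct in responses if not correct)

log = [
    ("s1", 1, True), ("s1", 2, False), ("s1", 3, True),
    ("s2", 1, True), ("s2", 2, False), ("s2", 3, False),
]
tally = error_tally(log)
worst_frame, misses = tally.most_common(1)[0]
print(worst_frame, misses)  # 2 2
```

The frame with the largest tally marks the point in the program most in need of revision.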


Complex Devices 


One of the most elaborate devices designed for educational research 
is the Western Design Tutor (Western Design, 1960). It is an automatic 
random-access recording microfilm and motion-picture projector which 
contains 1000 or more motion-picture frames that can be presented in 
any order. By pushing a button on the control panel, the user sees a 
frame or a short segment of a motion-picture film. The type of instruc- 
tional program usually put in this device is known as a “scrambled book”; 
a paragraph of material is followed by a multiple-choice question about it. 
The student responds to the multiple-choice question by turning to the 
page (in this instance, frame) numbered to correspond to the code number 
given by his answer choice. A scrambled book can be used without 
a machine, but with the scrambled book on film in the Western Design 
Tutor a complete record of the student’s responses can be made, as well as 
of his latency. Since scrambled books can be written to permit the learner 
to be shunted through any one of several learning sequences (branching), 
depending on their apparent appropriateness, an automatic recording 
device for research is highly desirable. 

Branching refers to the student’s being sent, at certain points, on a 
remedial loop or back to an earlier point. Branching is usually involun- 
tary on the part of the student and is determined by his errors. Other 
criteria, however, could be used for branching, such as latency of response 
or the student’s conscious desire and indication that he wishes to review 
or speed up. 
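
The branching criteria named above (errors, latency of response, and the student's own request) can be combined in a single decision rule. The Python sketch below is an assumption-laden illustration: the 30-second latency threshold, the `remedial` table, and the function name are all hypothetical.

```python
def next_frame(current, correct, latency_s, wants_review=False,
               remedial=None, slow_threshold_s=30.0):
    """Pick the next frame number in a branching program.

    remedial maps a frame to the first frame of its remedial loop
    (hypothetical); a frame with no entry is simply repeated.
    """
    remedial = remedial or {}
    if not correct or wants_review or latency_s > slow_threshold_s:
        return remedial.get(current, current)
    return current + 1

loops = {5: 2}
print(next_frame(5, True, 4.0, remedial=loops))    # 6: correct and prompt
print(next_frame(5, False, 4.0, remedial=loops))   # 2: error sends him back
print(next_frame(5, True, 45.0, remedial=loops))   # 2: slow response
```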

The SAKI (now known as Rheem Caliphone Corporation, DIDAK 
1001) is a device for training key-punch operators to punch cards 
by means of a 10-key keyboard similar to that of an adding machine. 
Its small circuit, similar to that of a computer, branches the rate of 
presentation according to the latency of the student’s response. 


Computers 


Rath, Anderson, and Brainerd (1959) described an IBM 650 general- 
purpose digital computer with a typewriter input-output, which has been 
used for more-elaborate branching based on individual differences in skill 
and rate. The computer also has a voluntary branching feature in which 
the student requests an easier program. It gives knowledge of results 
key by key; in other words, the student is informed of his mistake if he 
types even a single wrong letter. 

The use of computers has so interested some researchers that they 
have simulated computer experiments with human beings. Using a con- 
cealed human observer instead of a mass of electronic tubes, Coulson 
and Silberman (1960) investigated three teaching-machine variables—size 
of step, mode of response, and branching. Eighty junior-college students 
divided into eight groups were taught part of the Skinner-Holland psy- 
chology course. No significant difference was found in the mode of re- 
sponse, whether multiple-choice or constructed. Small-step items required 
more time but yielded significantly higher test scores than did large-step 
items on the constructed-response subtest. Branching conditions generally 
did not show a significant difference on the criterion test, except that they 
required less time when steps were skipped. 

Investigating the same two response variables, multiple-choice and 
constructed items, Fry (1960) found that constructed-response items 
yielded significantly higher results than multiple-choice, when measured 
by a constructed-response post-test. Fry used a cardboard folder with a 
window slot to simulate a teaching machine. Both Coulson and Silberman 
and Fry had difficulty with the multiple-choice section of the post-test, 
which failed to rate differences between the groups. Longer training or 
more complex material might have overcome this difficulty. 

Continuing the same series of investigations by means of a Bendix G-15 
computer, Silberman (1960) found preliminary results to indicate that 
effectiveness of teaching by machine is positively related to intelligence 
when only one trial is given of the material to be learned. Silberman’s 
findings conflict with some of the statements from Harvard that hold 
that teaching-machine programs tend to obscure differences between 
bright and dull students. 

All the devices so far discussed are for use by one student at a time. 
There have been several proposals for group use of computers. Ramo’s 
(1957) conception of tomorrow’s school smacks of science fiction: a com- 
puter to record a student’s attendance by his thumb print, a computer to 
record his responses to instruction, results automatically recorded in a 
master memory file. The guidance counselor could at any time procure 
a complete record of the student’s work by pushing a button. 

Bushnell and Silber (1960) described System Development Corporation’s 
proposed group-automated teaching device, to consist of a digital 
computer with magnetic-tape storage, alphanumeric printer, random-access 
light projector with back-projection screen, and individual desks 
equipped with student-response keyboards. In addition to giving knowl- 
edge of results to the student, the computer would analyze the behavior 
of the class to determine the selection of the material to be presented. 


Language Laboratories 


Language teaching by laboratory methods has been increasingly popu- 
lar, in part as a result of the financial aid provided by the U.S. Office of 
Education and various foundations. The laboratory uses an auditory 
stimulus, such as a foreign language phrase, to be imitated and records 
the student’s response on tape. The master control panel permits recording 
of any student’s response for further analysis and research purposes. 
Motion pictures and slide projectors are also part of the mechanization 
of language teaching. The language laboratory is mainly an instructional 
rather than a research device; but Ramo (1957) conceived its use 
as an extension of psychological theory and related it to teaching machines, 
as did Morton (1960). 


Guidance Devices 


Guidance by slide projector and audio tape has proved useful in indus- 
trial situations. Irion and Briggs (1957) described the Hughes Aircraft 
“Video-Sonics” device, which is reported by Klass (1960) to have reduced 
employee errors on an electronics assembly line by 99 percent in 10 
months and to have increased production from 60 percent of work 
standard to 90 percent. It tells the operator exactly which act to make, 
simultaneously showing him a picture of the act. It has implications for 
all manipulative training situations. 

The effect of guidance, long known to be efficient in training situations, 
was further shown by use of the Subject-Matter Trainer, a multiple-choice 
machine also described by Irion and Briggs (1957), which presents an 
item to be matched with one of 20 answers. Primarily a research device, 
it can operate in a number of modes: for example, the student can be 
allowed to make only one error, or the student can be allowed to make 
any number of errors, and the machine will not proceed until the correct 
response is made. It proved most effective when the student read the 
question, pushed the button, and read the correct answer. 


Simpler Teaching Devices 


Not all devices are elaborate. Porter (1959) developed a write-in 
machine into which the pupil feeds by hand a sheet of duplicating paper. 
The paper is in a box so that the student cannot view it after it is fed 
into a roller. Activation of the roller exposes several lines at a time. 
The student reads the stimulus and writes an answer on the sheet. When 
the roller is activated, his response passes under glass, and at the same 
time the correct answer is shown. There are several varieties of simple 
write-in machines like this on the market. 

Porter (1959) used his device for 22 weeks of the normal 34-week 
spelling program in grades 2 and 6. Standardized achievement tests 
showed the experimental group to be significantly superior to the control 
group. Porter found no relationship between intelligence scores and 
achievement in the experimental group, but a significant relationship in 
the control group. A check on the novelty factor was comparison of 
first-half performance scores with second-half performance scores; no 
difference was observed. Porter believed the experimental group spent 
only one-fourth as much time studying as did the control group. 


Nonmachines 


Use of teaching machines prompted some researchers to apply the same 
learning principles without mechanical aids. Homme and Glaser (1959) 
offered a method called a “Programmed Text,” in which a stimulus item 
such as an incomplete statement is presented; the student responds on 
scratch paper, turns the page, and reads the answer. The pages have a 
special format of panels to save space. 

Neither Eigen and Komoski (1960) nor Roe (1960) found significant 
differences in learning when identical material was presented by machines 
and programed texts. 

A device which performs some of the functions of machine instruction 
is the tab-item, which requires the student to respond to a multiple-choice 
question by pulling a tab. Using this technique with 48 NROTC students, 
Bryan, Rigney, and Van Horn (1957) found that a student’s knowing 
why an answer is incorrect is significantly more effective than his know- 
ing simply that his answer is correct or incorrect. 


Conclusion 


Instrumentation has provided interesting vistas and pathways for the 
educational researcher, and also demonstrated that many of its principles 
can be used without the aid of mechanics. It is quite possible that, 
through research in instrumentation, educational researchers can significantly 
improve classroom instruction even if they conclude that instruments 
should not be used at all. 


CHAPTER VIII 


Data Processing: Automation in Calculation 


E. WAYNE MARTIN, JR. and DALE J. HALL 


Availability of the electronic computer makes it possible currently to 
employ new methods in many areas of research. Performance of 1 million 
multiplications on a desk calculator is estimated to require about five years 
and to cost $25,000. On an early scientific computer, 1 million 
multiplications required eight minutes and cost (exclusive of programing 
and input preparation) about $10. With the recent LARC computer, 
1 million multiplications require eight seconds and cost about 
50 cents (Householder, 1956). Obviously it is imperative that researchers 
examine their methods in light of the abilities of the computer. 
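
The figures quoted above can be reduced to a cost per multiplication and to simple ratios. The short Python sketch below uses only the numbers given in the text; the variable and function names are illustrative.

```python
MULTIPLICATIONS = 1_000_000

# Total cost in dollars of 1 million multiplications, as quoted above.
cost_dollars = {"desk calculator": 25_000.0, "early computer": 10.0, "LARC": 0.5}

def cents_per_multiplication(total_dollars, n=MULTIPLICATIONS):
    """Convert a total cost in dollars to cents per single multiplication."""
    return 100.0 * total_dollars / n

def cost_ratio(a, b):
    """How many times more method a costs than method b."""
    return cost_dollars[a] / cost_dollars[b]

# Eight minutes against eight seconds for the same work.
larc_speedup = (8 * 60) / 8

for name, dollars in cost_dollars.items():
    print(name, cents_per_multiplication(dollars), "cents each")
print(cost_ratio("desk calculator", "LARC"), larc_speedup)
```

On these figures the desk calculator costs about 2.5 cents per multiplication, the LARC is 50,000 times cheaper than the desk calculator, and it is 60 times faster than the early computer.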

It should be noted that much of the information published on computers 
and their use has not appeared in educational or psychological literature 
but rather in publications specifically concerned with computers, mathe- 
matics, engineering, and business. The following selective survey is in- 
tended to guide the beginner into this broad and sometimes confusing 
area. It is not an exhaustive survey. It is presumed that the reader has 
access to the excellent Wrigley (1957) article; so the major purpose of 
this review is to note additions since 1957. 

The following topics are discussed: equipment availability, knowledge 
needed to use computers, general references, programing the computer, 
numerical analysis, statistical techniques, operations research, and mecha- 
nization of thought processes. 


Equipment Availability 


As of December 1960, approximately 4000 stored-program electronic 
computers were in use in the United States. A wide variety of equipment 
is also available: desk-size engineering computers that plug into a wall 
outlet; many varieties of data processors with fast input, output, and 
access to large files of information; building-block machines whose con- 
figurations can be tailored to a variety of capacity requirements; the huge 
LARC of Remington Rand and IBM’s STRETCH, machines that, pressing 
present technology to its limit, are capable of 1 million calculations per 
second. The use of solid-state elements, such as magnetic cores and tran- 
sistors, has reduced the size and power requirements of the newer equip- 
ment and has markedly improved reliability. 

Keenan (1960) presented the results of a survey of equipment, staff, 
financing, and courses offered in connection with 100 university computing 
centers. The survey revealed that computers at colleges and universities 
are already numerous and that soon almost any college will be able to 
obtain an obsolete but perfectly serviceable machine at little expense. 
In industry, most computers are used on a one-shift or two-shift basis, and 
it is not difficult to obtain machine time for research projects at nominal 
cost. 

Although computers are available for the researcher, there is one vital 
shortage—trained people. Effective utilization of many computers is de- 
pendent on the ability of the researcher to use the machine himself. Thus 
it is important that the researcher be able to use the computer, for not 
only can he more easily make use of machines that are available, but he is 
also likely to obtain a better solution to his problem. More important, 
he is then equipped to use the computer to solve a larger and possibly 
more important problem. 


Knowledge Needed To Use Computers 


Several steps are necessary when a computer is used to solve a problem: 
(a) The problem must be defined in logical or mathematical terms. 
(b) This logical or mathematical formulation must be translated into 
an arithmetical procedure. (The translation from a mathematical state- 
ment into an arithmetical procedure is the subject matter of numerical 
analysis.) (c) An explicit series of instructions to the computer (the 
program) must be prepared to direct the computer through each step 
necessary to solve the problem. (d) The input data must be recorded in 
a form which the machine can read. (Readable media are punched cards, 
punched paper tape, and digital magnetic tape.) (e) Finally, the problem 
must be run—and the computer produces answers. 
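
The five steps can be followed through a deliberately small problem, the mean of a set of test scores. The Python sketch below is an illustration only; the fixed-column record layout standing in for punched cards is invented for the example.

```python
# (a) Problem in mathematical terms: mean = (sum of scores) / (number of scores).
# (b) Arithmetical procedure: keep a running total and a running count.

# (d) Input in machine-readable form: one record per student, standing in
#     for a punched card (columns 1-4 = student number, 5-7 = score;
#     this layout is made up for the example).
cards = ["0001 85", "0002 90", "0003 77"]

def read_card(card):
    """Split one fixed-column record into (student number, score)."""
    return card[0:4], int(card[4:7])

# (c) The program: explicit instructions for every step of the procedure.
def mean_score(cards):
    total, count = 0, 0
    for card in cards:
        _, score = read_card(card)
        total += score
        count += 1
    return total / count

# (e) The run.
print(mean_score(cards))  # 84.0
```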

The following sections of this survey present information about these 
various steps. Frequently used mathematical and statistical models are 
presented in the sections titled “Statistical Techniques” and “Operations 
Research”; and a section titled “Numerical Analysis” is included. Tech- 
niques explained in the section titled “Programing the Computer” show 
how the researcher may use a machine without knowing all its technicali- 
ties. Punched-card input can be prepared by mark sensing or key punch- 
ing, as described in the “General References” section. If many data are to 
be recorded from experimental equipment, it may be desirable to investi- 
gate the possibility of direct analogue-to-digital recording as discussed 
by Klein (1958) and Young (1960a). 


General References 


A number of excellent books about computers have been published 
recently. Andree (1958) discussed the programing of the IBM 650, and 
Wrubel (1959) used the IBM 650 as a vehicle to present techniques of 
programing scientific problems. McCracken, Weiss, and Lee (1959) presented 
the programing of computers for data processing hypothetically. A 
comprehensive discussion of electronic data-processing machines and their 
use was presented in Gregory and Van Horn’s (1960) book. Gille Asso- 
ciates (1961) compiled an equipment encyclopedia that describes various 
types of computers. 

Sperry Rand (1959) produced An Annotated Bibliography. Among the 
periodicals of general interest not mentioned in Wrigley (1957) are 
Computers and Automation, Control Engineering, and Datamation. Com- 
puters and Automation periodically publishes a “Roster of Organizations 
in the Computer Field” and a “Who’s Who in the Computer Field.” 
Descriptive manuals are available from computer manufacturers, but most 
are written for reference purposes and are not easily read by the neophyte. 


Programing the Computer 


As in all problem solving, a precise statement of the factors and their 
mathematical or logical relations must first be prepared before the com- 
puter can be programed. With this statement, the researcher is ready to 
make the first move toward solution. 


Program Library 


If the solution of the problem requires use of a standard mathematical 
technique, the first step should always be a search for a program that 
may already be available to solve the problem with the computer to be 
used. Owing to the cost of writing computer programs, most manufac- 
turers supply a number of basic programs and encourage the exchange 
of general-interest programs among their users. Many groups have organ- 
ized for this exchange purpose, including SHARE (The Society To Help 
Avoid Repetitive Effort) for the IBM 704, 709, and 7090 computers; 
USE (UNIVAC Scientific Exchange) for the UNIVAC 1103A and 1105 
computers; and CUE (Computer Users Exchange) for the Burroughs Datatron 
220. Information regarding these organizations and their membership 
may be obtained from their secretaries.* 

University computing centers also provide an excellent source of refer- 
ence for library programs and have available some information on re- 
search in other universities. Many university computing centers distribute 
to other centers annual reports describing current research projects. 

* The secretary for SHARE is Henry A. McCabe, Electronic Data Processing Department, Union Carbide 
Corporation, 270 Park Avenue, New York. The secretary for USE is J. W. Nikitas, 315 Park Avenue 
South, New York. The secretary for CUE is Robert Gordon, Director of Data Processing, Stanford 
University, Stanford, California. 

When deciding whether or not he can make use of a library program, 
the researcher must carefully examine the written description of the 
program to determine the following: (a) Does the program use the 
appropriate mathematical method? Is the proper numerical technique used? 
(b) Are the characteristics of the problem at hand within the limitations 
imposed by the program? (For example, a program for finding the 
inverse of a matrix whose order does not exceed 30 cannot be used 
with a matrix of order 40.) (c) In what form must the information be 
presented to the computer? (d) Is the equipment configuration required 
by the program available on the machine? (e) Is the program available, 
along with a detailed description of its use? 
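
The five questions can be treated as a checklist to be run against the written description of a library program. The Python sketch below is one possible arrangement; the `LibraryProgram` record, its fields, and the program name are hypothetical and do not correspond to any actual SHARE, USE, or CUE listing.

```python
from dataclasses import dataclass

@dataclass
class LibraryProgram:
    name: str
    method: str
    max_matrix_order: int
    input_medium: str
    required_equipment: frozenset
    documented: bool

def reasons_to_reject(program, *, method, matrix_order, input_medium, equipment):
    """Return the checklist items (a)-(e) that the program fails."""
    reasons = []
    if program.method != method:
        reasons.append("method")            # (a)
    if matrix_order > program.max_matrix_order:
        reasons.append("problem size")      # (b)
    if program.input_medium != input_medium:
        reasons.append("input form")        # (c)
    if not program.required_equipment <= set(equipment):
        reasons.append("equipment")         # (d)
    if not program.documented:
        reasons.append("documentation")     # (e)
    return reasons

invert = LibraryProgram("INVERT-30", "matrix inversion", 30, "punched cards",
                        frozenset({"card reader"}), True)
problems = reasons_to_reject(invert, method="matrix inversion", matrix_order=40,
                             input_medium="punched cards",
                             equipment={"card reader", "printer"})
print(problems)  # ['problem size']
```

An empty list means that the library program passes all five checks for the problem at hand.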

When the program-library approach to the problem fails, the researcher 
must look to other methods. Early in the use of computers, it was realized 
that coding in machine language was difficult for the neophyte. In order 
to simplify this task, automatic programing techniques were developed 
which allow problems to be stated in a language which is more convenient 
for the researcher and which can be translated by the computer to pre- 
pare a program of machine instructions. 


Interpretive Systems 


Among the first approaches to automatic programing were the inter- 
pretive systems, in which the pseudo instructions of the programing lan- 
guage were stored in the memory of the computer, along with a program 
that translated these pseudo instructions into the proper sequence of 
machine instructions as the computer engaged in the process of solution. 

The most widely known general-purpose interpretive system for the 
IBM 650 is “Bell Telephone Laboratories Interpretive Code,” which has 
been described by Wolontis (1956), Andree (1958), and Wrubel (1959). 
Other systems were developed for special purposes, such as the University 
of Michigan “MITLAC” (1955), which included differential equation 
operations, and “SIS” (Haynam, 1957), which is designed for the solu- 
tion of routine statistical problems. Frequently one of the existing inter- 
pretive systems will lend itself to the solution of the problem at hand; 
however, since time is required to perform the translation of each program 
run, this convenience must be paid for in terms of computer execution 
time. 


Compiler Systems 


A compiler is a translating program written for a particular computer 
which accepts a form of mathematical or logical statement as input and 
produces as output a machine-language program to obtain the results. 
Since the translation must be made only once, the time required to 
repeatedly run a program is less for a compiler than for an interpretive 
system. And since the full power of the computer can be devoted to the 
translating process, the compiler can use a language that closely resembles 
mathematics or English, whereas the interpretive languages must resemble 
computer instructions. The first compiling program required about 
20 man-years to create, but use of compilers is so widely accepted today 
that major computer manufacturers feel obligated to supply such a system 
with their new computers on installation. 
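
The difference in running time between the two approaches can be seen by counting translations. In the Python sketch below, a made-up three-instruction language is run three times, first interpretively and then after a single compilation. Python functions stand in for machine instructions, so this is an analogy under stated assumptions rather than a model of any system named here.

```python
PROGRAM = ["ADD 5", "MUL 3", "SUB 4"]   # invented pseudo instructions

OPS = {
    "ADD": lambda acc, n: acc + n,
    "SUB": lambda acc, n: acc - n,
    "MUL": lambda acc, n: acc * n,
}

stats = {"decodes": 0}   # number of translations performed

def decode(line):
    """Translate one pseudo instruction into an (operation, operand) pair."""
    stats["decodes"] += 1
    name, operand = line.split()
    return OPS[name], int(operand)

def interpret(program, acc):
    # Interpretive system: each instruction is translated on every run.
    for line in program:
        op, n = decode(line)
        acc = op(acc, n)
    return acc

def compile_program(program):
    # Compiler: translate once, then return a routine that only executes.
    steps = [decode(line) for line in program]
    def run(acc):
        for op, n in steps:
            acc = op(acc, n)
        return acc
    return run

results_interpreted = [interpret(PROGRAM, 1) for _ in range(3)]
interpreted_decodes = stats["decodes"]

stats["decodes"] = 0
run = compile_program(PROGRAM)
results_compiled = [run(1) for _ in range(3)]
compiled_decodes = stats["decodes"]

print(interpreted_decodes, compiled_decodes)  # 9 3
```

Both approaches give the same answers; the interpretive runs pay for nine translations where the compiled routine pays for three.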

Compilers, like the interpretive systems, reflect the needs of various 
types of users. For example, the IBM computers use “FORTRAN” (Inter- 
national Business Machines, 1957, 1958, 1959a) for scientific program- 
ing and “9 PAC” (International Business Machines, 1960c) and “Com 
Tran” (International Business Machines, 1960b) for commercial data 
processing; the Sperry Rand computers use “Math-Matic” (Sperry Rand, 
1958b) for scientific programing and “Flow-Matic” (Sperry Rand, 1958a) 
for commercial data processing; Burroughs provides “FORTOCOM” 
(Turner and Waychoff, 1960) for scientific programing and “BLESSED 
220” (Burroughs Corporation, 1960) for commercial data processing. 
There is some interest in the use of “COBOL” as a translation system 
common to all computers (International Business Machines, 1960a; Sperry 
Rand, 1960a, c; Radio Corporation of America, 1960). 


Assembly Systems 


Sometimes there is no recourse but to work in the computer’s own lan- 
guage. This implies a good knowledge of the physical operations of the 
computer and their application to the problem at hand. 

Assembly systems do not remove these requirements, but they make 
the task easier. They provide an easier form of expressing the operations 
to be performed upon the factors in memory. This is accomplished by a 
simple form of translating program which reads alphabetical abbrevia- 
tions for the operations codes and symbolic names for memory locations 
and translates them into the numerical language of the computer. Usually 
there is a one-to-one correspondence between the steps of a symbolic 
program operated on by the assembler and the machine-language program 
it produces. 
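
The translation performed by an assembler can be sketched as a table lookup. In the Python example below, the mnemonics, numerical operation codes, and starting address are all invented for illustration and belong to no real computer.

```python
OPCODES = {"LOAD": 10, "ADD": 20, "STORE": 30, "HALT": 99}   # invented codes

def assemble(source, first_data_address=100):
    """Translate symbolic lines into (opcode, address) pairs, one per line."""
    symbols = {}
    machine = []
    for line in source:
        parts = line.split()
        mnemonic = parts[0]
        operand = parts[1] if len(parts) > 1 else None
        if operand is None:
            address = 0
        else:
            if operand not in symbols:
                # Each new symbolic name gets the next free memory location.
                symbols[operand] = first_data_address + len(symbols)
            address = symbols[operand]
        machine.append((OPCODES[mnemonic], address))
    return machine, symbols

source = ["LOAD X", "ADD Y", "STORE SUM", "HALT"]
machine, symbols = assemble(source)
print(machine)  # [(10, 100), (20, 101), (30, 102), (99, 0)]
```

The output has exactly one numerical instruction for each symbolic line, which is the one-to-one correspondence noted above.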

Assembly languages must conform to the design of the computer. The 
IBM manual for “SOAP II” (Symbolic Optimal Assembly Program writ- 
ten for the IBM 650 Magnetic Drum Computer) (International Business 
Machines, 1959b) presents an excellent example of the special nature of 
assembly systems. The needs for assembly systems are recognized by 
computer manufacturers and are considered a part of the tools supplied 
by them. 


Numerical Analysis 


Most of the groundwork for traditional numerical analysis was laid by 
Newton and his contemporaries in the seventeenth and eighteenth centuries. The early 
numerical techniques were developed by traditional mathematical methods. 
The advent of the modern computing machine precipitated a revolution 
in numerical methods, for the rapid acceptance of high-speed computers 
far exceeded the rate of development by traditional methods of the 
numerical techniques necessary. Accordingly, formal mathematical de- 
velopments frequently gave way to hunches and modifications of proved 
methods, and much of the development in numerical analysis within the 
past 10 years came about as the result of modification and elaboration 
of traditional methods, with little concern for error analysis. 

Research in numerical analysis today is chiefly concerned with rectify- 
ing its developmental shortcomings. Much of the research deals with error 
analysis and stability conditions of all types of numerical methods. The 
most complete and up-to-date information in this area is to be found in 
the professional journals. The major sources are Computers and Auto- 
mation, IRE Transactions on Electronic Computers, Journal of the 
Association for Computing Machinery, and the Journal of the Society for 
Industrial and Applied Mathematics. Valuable articles also appear fre- 
quently in journals associated with fields in which computers are com- 
monly used, such as physics, chemistry, astronomy, and psychology. 

Today’s textbooks in numerical methods concern themselves with spe- 
cific areas, for example, Richtmyer’s (1957) book on difference techniques 
in physical problems. The Ralston and Wilf (1960) text appears to be one of 
the few available in the general area of numerical analysis that is specif- 
ically concerned with modern computing techniques. The recent Hand- 
book of Automation edited by Grabbe, Ramo, and Wooldridge (1959) 
provides an excellent reference for modern methods. 

There has been much study of linear systems, and modern algebra saw 
much activity during the past three years. More than 130 computer 
programs involving linear algebra are available for the IBM 704, IBM 
709, and IBM 7090. McKay (1957) described a special “Matrix Math 
Compiler” for the Remington Rand UNIVAC I. Faddeeva (1959) pro- 
vided a valuable supplement to standard texts on linear algebra. 


Statistical Techniques 


Fortunately, computers have been in existence long enough so that 
many programs necessary for routine data reduction exist. A bibliography 
of statistical programs is beyond the scope of this review. The present 
purpose, therefore, is to inform the reader where such information can be 
obtained and to discuss the area generally. 

Hamblen (1959) presented a compilation of abstracts of statistical 
programs for the IBM 650. He described 103 programs: 13 experimental- 
design, 35 correlation and multiple-regression, 6 factor-analysis, 7 curve- 
fitting and surface-fitting, 6 time-series and frequency-table, 10 nonpara- 
metric-statistics, and 26 random-numbers and miscellaneous. These are 
only a few, relatively, of the statistical programs available for one computer. 

Michael (1959, 1960) edited a section of Educational and Psychological 
Measurement devoted to programing and procedures. In it Iker reported 
on computation by IBM 650 of group differences and means (1960a) and 
of item analysis using either a continuous (1960b) or a dichotomous 
criterion variable (1960c). Gaddis (1959) discussed questionnaire analy- 
sis. Kamman and others (1959) described a follow-up of work on test 
scoring by means of accounting machines. Madden (1959) used an IBM 
709 for efficient test-battery analysis. Multivariate and a variety of factor- 
analysis applications were presented by Kaiser (1959, 1960) and by Horst, 
Dvorak, and Wright (1960). 

The Statistical Laboratory of Case Institute of Technology began in the 
fall of 1960 to compile a “Bibliography of Statistical Computer Routines,” 
which when completed should provide a useful tool for the researcher. 
Information on computer programs in this area frequently appears in 
professional journals such as Psychometrika, Educational and Psychological 
Measurement, Journal of Experimental Psychology, Journal of the 
American Statistical Association, and Behavioral Science. 


Operations Research 


Operations research is the application of the scientific method to manage- 
ment problems in organizations. Many ideas and techniques developed in 
the operations research literature may have value for educational re- 
searchers. An excellent bibliography of the field of operations research 
was prepared by the Case Institute of Technology Operations Research 
Group (1958). Among recent books on the subject are those by Churchman, 
Ackoff, and Arnoff (1957), Saaty (1959), and Sasieni, Yaspan, and 
Friedman (1959). Periodicals devoted to this subject include Operations 
Research and Management Science. 

Linear programing is a mathematical technique for maximization or 
minimization of a linear function subject to a number of linear restrictions. 
Riley and Gass (1958) prepared a bibliography of linear programing. 
Among the many good books on the subject are those by Stockton (1960), 
Gass (1958), Ferguson and Sargent (1958), and Dorfman, Samuelson, 
and Solow (1958). Stockton’s presentation is elementary and serviceable 
to those without mathematical background. Orchard-Hays (1958) de- 
scribed several computer programs for solving linear programing prob- 
lems, and Shetty (1959) discussed the effect of changes or inaccuracies in 
the coefficients in a linear programing problem. 
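
The definition of linear programing given above can be illustrated with a small worked case. The sketch below is not taken from any of the works cited; it assumes a problem in only two variables with a bounded feasible region, and it finds the maximum by examining every corner at which two restrictions meet. The function name solve_lp_2d and the numerical problem are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def solve_lp_2d(c, constraints):
    """Maximize c[0]*x + c[1]*y subject to a*x + b*y <= r for each (a, b, r),
    with x >= 0 and y >= 0, by checking every corner of the feasible region.
    Assumes the feasible region is bounded; returns None if it is empty."""
    # Treat the non-negativity conditions as ordinary restrictions: -x <= 0, -y <= 0.
    rows = list(constraints) + [(-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(rows, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries do not meet in a single corner
        # Cramer's rule for the point where the two boundary lines cross.
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        # Keep the corner only if it satisfies every restriction.
        if all(a * x + b * y <= r + 1e-9 for a, b, r in rows):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best

# Hypothetical problem: maximize 3x + 5y
# subject to x <= 4, 2y <= 12, and 3x + 2y <= 18.
value, x, y = solve_lp_2d(
    (3.0, 5.0),
    [(1.0, 0.0, 4.0), (0.0, 2.0, 12.0), (3.0, 2.0, 18.0)],
)
print(value, x, y)  # prints 36.0 2.0 6.0
```

Corner enumeration is practical only for very small problems; the computer programs described in the literature cited above rely on more efficient procedures.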

The theory of games concerns itself with conflict situations. Luce and 
Raiffa (1957) presented an excellent over-all survey of game theory and 
its significance in the social and behavioral sciences. Flood (1958) pro- 
vided a game-theoretic discussion of several common conflict situations. 
Ellsberg (1956) gave an interesting critique of game theory.

Queuing (or waiting line) theory is concerned with problems in which 
services are provided to customers who arrive in a random manner and
wait in line to receive the service. The objective of the theory is to mini- 
mize the total of the cost of providing the service and the cost of 
customer waiting. Morse (1958) devoted a book to queuing theory and 
its applications, and Shelton (1960) presented a compilation of several 
formulas that have been developed for various types of queuing situations. 
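
The cost trade-off stated above can be made concrete for the simplest textbook case. The sketch below is an illustration under stated assumptions and is not drawn from the works cited: it assumes a single server, random (Poisson) arrivals, and exponentially distributed service times, for which the mean number of customers in the system is arrival_rate / (service_rate - arrival_rate). The cost figures and the function names are hypothetical.

```python
import math

def total_cost_rate(arrival_rate, service_rate, service_cost, waiting_cost):
    """Cost per unit time: cost of providing service plus cost of customers
    in the system, for an assumed single-server line with random arrivals."""
    if service_rate <= arrival_rate:
        return math.inf  # the line grows without bound
    mean_in_system = arrival_rate / (service_rate - arrival_rate)
    return service_cost * service_rate + waiting_cost * mean_in_system

def best_service_rate(arrival_rate, service_cost, waiting_cost):
    """Service rate at which the derivative of the total cost is zero."""
    return arrival_rate + math.sqrt(waiting_cost * arrival_rate / service_cost)

# Hypothetical figures: 8 arrivals per hour, service costing 2 per unit of
# service rate per hour, and waiting costing 9 per customer-hour.
mu_star = best_service_rate(8.0, 2.0, 9.0)
print(mu_star, total_cost_rate(8.0, mu_star, 2.0, 9.0))  # prints 14.0 40.0
```

Faster service raises the first cost term and lowers the second, and the minimum lies where the two marginal effects balance. Lines with several servers or other arrival patterns require the more elaborate formulas compiled in the literature.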

One of the most useful tools for the analysis of large and complex 
problems is simulation, by means of which a model of the situation under 
investigation is operated (usually by a computer) through succeeding 
intervals of time in order to evaluate performance under assumed condi- 
tions. Malcolm (1958) presented a number of examples of simulation 
and its use, and Martin (1959a) described several large simulation studies. 
A bibliography of simulation and its use was presented by Malcolm 
(1960). The industrial dynamics variety of simulation was discussed by 
Forrester (1958, 1959), and a simulation of the shoe industry was de- 
scribed by Cohen (1960). Enke (1958) described a large simulation in 
which human decision makers were integrated into a computer simulation 
in a situation too complicated to be handled by either the persons or the 
computer alone. Conway, Johnson, and Maxwell (1958) and IBM’s
Job Shop Simulation Application (1960d) described general-purpose 
computer programs for simulation of job-shop dispatching operations. 
Davis (1959) discussed an automatic programing system designed for 
writing simulation programs. 
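
The idea of operating a model through succeeding intervals of time can be sketched in a few lines. The example below is a minimal illustration, not a reconstruction of any study cited above. It assumes a single service counter at which, in each interval, one customer arrives with probability p_arrival and a customer in the line, if any, is served with probability p_service. The function name simulate_line and its parameters are hypothetical.

```python
import random

def simulate_line(p_arrival, p_service, intervals, seed=0):
    """Step a one-counter waiting line through `intervals` time steps.
    Returns (average number in line per interval, number of customers served)."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated exactly
    waiting = 0
    total_waiting = 0
    served = 0
    for _ in range(intervals):
        # Assumed condition 1: at most one arrival per interval.
        if rng.random() < p_arrival:
            waiting += 1
        # Assumed condition 2: at most one service completion per interval.
        if waiting > 0 and rng.random() < p_service:
            waiting -= 1
            served += 1
        total_waiting += waiting  # record the state at the end of the interval
    return total_waiting / intervals, served

# Compare performance under two assumed service probabilities.
for p_service in (0.5, 0.8):
    average_line, served = simulate_line(0.4, p_service, intervals=10_000, seed=1)
    print(p_service, round(average_line, 2), served)
```

Repeating the run with different assumed probabilities, as in the loop above, is the sense in which simulation is used to evaluate performance under assumed conditions.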

A variation of the simulation technique called decision gaming (or 
management gaming, management decision simulation) was widely 
used for training and for research into various aspects of decision making. 
Several of these simulation exercises and their use were discussed by Bell- 
man (1958), Martin (1959b), IBM’s Management Decision-Making La- 
boratory (undated), and Sperry Rand’s Marketing Management Simula- 
tion (1960b). The University of Kansas (1959) published the proceedings 
of a symposium devoted to discussion of various exercises of the same 
kind and the points of view of several people concerning their use. 
Guetzkow (1959) described experiments with the use of noncomputer 
simulation exercises in the area of international relations. 


Mechanization of Thought Processes 


A great deal of research effort is currently devoted to the possibility 
of the use of computers or the design of more advanced machines to 
perform in a manner that resembles the functioning of the human brain in 
certain respects. Much of this research has used the digital computer as an 
indispensable research tool. A National Physical Laboratory (1959) publi- 
cation included a number of papers on this subject, and Behavioral 
Science reported a good deal of research. Young (1960a, b) and Uhr 
(1959) presented general surveys of the work in this field. 

One basic approach is that of devising and (through computer simula-
tion) testing theories of how the neurons of the brain interact with one 
another in thought processes. Reiss (1960) provided an excellent introduc- 
tion to this approach. Another tack, the simulation of human methodologi- 
cal approaches to problem-solving activity, was taken by Gelernter and 
Rochester (1958); Newell, Shaw, and Simon (1958, 1959); Friedberg 
(1958); Simon and Newell (1958); and Hagensick (1960). Green (1960)
reported on an automatic programing language devised to simplify re- 
search in this area. 

The important problem of machine retrieval of information from (some- 
times specialized) libraries has received much attention over the last few 
years. Vandenberg (1960) and Ledley and Lusted (1960) reported on the 
status of the use of computers for medical information retrieval and 
diagnosis. Similar projects in chemistry, law, and other fields were briefly 
reported in Computers and Automation (1960a). Discussions of some of 
the concepts involved in information retrieval may be found in Bourne 
and Engelbart (1958) and Luhn (1957). 

Machine translation of languages was discussed by Blickstein (1960) 
and MacDonald (1960), and the outline of a recent conference was re- 
ported by Computers and Automation (1960b) in “National Symposium 
on Machine Translation.” Coulson and Silberman (1960) described the 
use of a computer as a component of a sophisticated teaching machine. 


Summary 


This review has surveyed research on computers since Wrigley’s 1957 
article. During this period the number of computers in existence has 
increased to an extent that makes them available for research. Programing 
techniques have significantly advanced, and mechanisms have been estab- 
lished for the interchange of programs of general interest. Courses in 
programing for beginners are available to faculty members at most in- 
stitutions that have computing centers. 

Significant progress has been made in the use of computers in processing 
data, in computation, and in the non-numerical areas of simulation of 
intelligent behavior. It is certain that use of computers in educational 
research will increase greatly throughout the next several years. 


Elementary-school teachers: training in
liberal arts, 239 

Emotionally disturbed: and counseling, 
132; interests and occupational choices 
of, 143; theories of vocational choices 
of, 149 

Empathy: relation to aesthetic sensitivity, 
135 


Empirical viewpoint: as applied to guid- 
ance, 98 

Employee attitudes: research on, 148 

Employment, college student: effects of, 
on achievement, 363 


Employment conditions: of secondary-
school teachers, 76

Engineering: accrediting problems of,
373; curriculums and general educa-
tion, 334; education, study of, 341, 376

English, classroom for teaching: need of, 
263 


English curriculum: in core classes, 235; 
high-school, changes in, 235 

English fundamentals: teaching by tele- 
vision, 323 

English instruction: goals of, 29 

Enrichment, inclass: utilization of, 265 

Environment, college: studies of, 311 

Error components, correlated: analysis of 
variance for, 434 

Estimation: relation to hypothesis testing, 
447 

Estimation of points: discussion of, 446 

Estimation with censored samples: discus-
sion of, 446 

Estimation with order statistics: discus- 
sion of, 447 

Ethical judgment, rational foundation
for: discussion of, 423

Ethics, professional: of the counselor, 125 

Ethnic environments: effect on personal- 
ity, 15 

Ethnocentrism: effect of college upon,
292, 325

European and American children: com- 
parison in arithmetic, 203 

Evaluation: of guidance and personnel 
services, 168; of learning, trends in, 52; 
of nonclass activities, 63; problems of, 
in studying colleges, 371; of teacher 
success, 78; as topic in philosophy of 
science, 422; of training for guidance 
and personnel workers, 121 

Evaluation of education: pre- and post- 
Sputnik, 203 

Evening-college students: performance in 
liberal-arts courses, 335 

Evidence: discussion of, in context of 
philosophy of science, 426 

Example-setting: effects of, 248 

Expenditure per pupil: and quality of 
education, 212 

Expenses, student: rise of, 390 

Experiment, single-factor: linear and
quadratic effects in, 432

Experimental design: discussion of, 430; 
textbooks in, 441 

Experiments: moral issues in, 422 

Experiments with repeated measure- 
ments: discussion of, 431 

Explanation: discussion of, in philosophy 
of science, 424 

Extracurricular activities: discussion of, 
oi 


F distribution: effect of non-normality of 
population on, 435, 446; effect of skew- 
ness on, 435 

Factor analysis: communality estimation 
in, 463; computational procedures, 469; 
discussion of, 463; more appropriate 
techniques than, 470; rotation methods 
in, 466 

Factorial experiments: discussion of, 431; 
multiplicative model for, 433 

Factors from factor analysis: comparison 
across studies, 468; number extracted, 
and relation to communality problem, 
464 

Faculty: real income of, 390; shortage of, 
at college level, 390 

Faculty-administration relationships, col- 
lege: study of, 374 

Faculty morale: and excellence in higher
education, 378 

Faculty personnel policies: of 11 state 
universities, 388; study of, 374 

Faculty of professional schools: views re- 
garding liberal education, 377 

Faculty subculture, college: studies of, 
314 

Faculty support: of nonclass activities, 61 

Family interaction patterns: and mental 
illness, 217 

Family life, changes in: influence on
curriculum, 209

Family-life courses: usefulness of, 210

Family norms: importance of, 6

Family relationships: of adolescents, 7, 14

Father-daughter relationship: investiga-
tion of, 7

Federal aid to education: history of
struggle over, 200

Federal departments’ education programs: 
study of, 340 

Federal sponsorship of research: ques- 
tions related to, 341, 391 

Feedback: properties of, 461 

Fellowships, graduate: information on,
363; national survey of, 341

Feminine identity: and lack of commit- 
ment to intellectual goals, 328 

Fiducial intervals: distinguished from
confidence intervals, 444

Field trips: effectiveness of, in occupa- 
tional choice, 161 

Field services vs. research: allocation of 
resources, 411

Finances, college: analysis of, in 60 insti- 
tutions, 392; effect on accreditation, 
373 

Financial aid, college student: studies of, 
363 

Financing of higher education: contro-
versy over, 390; effort of states in, 391;
private support of, 392; studies of, 390 

Finite populations: estimating the mean 
of, 446; sampling of, 445 

Fiscal controls, state level: and efficient 
institutional management, 386 

Fisher’s z conversion: substitute for, 463 

Fisher-Yates index: discussion of, 458 

Food-services facilities, college: studies
of, 362

Foreign educational practices: impact on 
American education, 202 

Foreign students: educational programs 
for, 365; follow-up of college training 
of, 365 

Foreign study: enrollment trends in, 289 

Foreigners, education of: bibliography of 
studies of, 490 

Foundations, philanthropic: activities of, 
226; origins, policies and roles of, 392 


Fraternity: acceptance of university ad-
ministration after group discussion, 
164; as an influential reference group, 
312; effects of, 362 


Funds: management of nonclass, 59 


Gains, measurement of: in evaluation,
378

Galvanic skin response: as measure of 
anxiety, 499; to measure teacher re- 
actions in classroom, 514 

Game theory: book on, 441; discussion 
of, 528 

General Aptitude Test Battery: and coun- 
selor effectiveness, 118 

General education: acceptance of, by stu- 
dents, 375; achievement of objectives 
of, 335; bibliographies on, 336; college 
undergraduate programs of, 237; effect 
on student interests, 329; goals of, in 
secondary education, 25; at high-school 
level, 235; important developments in, 
374; science courses, growth of, 36; in 
technical curriculums, 388; training of 
instructors for, 336 

General Educational Development Test: 
analysis of, 144; comparison of, with 
other mental ability tests, 144 

Geography curriculum: description of, 
237; new approaches to, 193

Gifted: education of, 34; goals for educat- 
ing, 31; instruction of, 49, 52; provi- 
sion for, 264, 265; views of, on growing 
up, 6 

Gifted student, college level: and college 
attendance, 289; and study habits, 325; 
studies of, 292 

Goals of education: perception by stu- 
dents and faculty, 291 

Goals of secondary education: comprehen- 
sive list of, 25; discussion of, 23 

Goal-setting: observation of, in teaching, 
508 

Goodenough Draw-a-Man Test: and In- 
dian youth, 14 

Goodness-of-fit: tests of, 452, 456 

Governing boards of colleges and universi- 
ties: study of, 388 

Government: relationship to higher educa- 
tion, 385 

Grades: effect of anxiety over, 251; effect 
of removal of, 231; relation of average 
level of to teacher-pupil rapport, 249; 
relation to student status and roles, 215 

Grading, co-operative: compared with 
competitive, 352 

Graduate education: studies of, 339; 
studies of, as a national resource, 340 

Graduate programs: in guidance and 
personnel work, 119 

Graduate schools: enrollment trends in, 

Graduate students: distribution among 
subject-matter fields, 288 

Graduate training: effect of federal sup- 
port of, on quality, 391; seniors’ plans 
for, 288 

Graduates, college: nature of, determined 
by initial selection, 372; proportion of, 
to initial enrollees, 321 

Graduation requirements, high-school: of 
states, 36 

Group-automated teaching device: discus- 
sion of, 517 

Group-centered classes: effectiveness of, 


Group counseling: research on, 159 

Group life: influence of changes in, on 
curriculum, 211 

Group planning, pupil skill in: relation 
to other classroom variables, 508 

Group procedures: use of, in guidance 
work, 158 

Group psychotherapy: in counselor educa-
tion, 119; research in, 164; use of, 252

Grouping: effects of flexible plan, 264; 
heterogeneous vs. homogeneous, 264 

Grouping, ability: effects of, 49 

Groups, observing and recording behavior 
of: discussion of, 496 

Growth: of adolescents, 13; technical
problems of studying, 330

Guidance: definition of, 102; differenti- 
ated from counseling, 131; in elemen- 
tary grades, 153; in nonclass activities, 
60; philosophical foundations of, 97; 
relationship to education, 101; role of 
classroom teacher in, 110; theoretical 
framework of, 100, 132 

Guidance and occupational literature:
compilations of, 489

Guidance and personnel services: evalua-
tion of, 168; studies of college, 171

Guidance directors: functions of, 106 

Guidance office: relation to administra- 
tive office, 108 

Guidance workers: duties of, 117; sources 
of, 116; supply and demand, 124 


Health, physical education, and recrea- 
tion: bibliographies of reports on re- 
search in, 

Health services, college: studies of, 363 

Heredity and environment: influence on 
vocational choice, 148 

Heterogeneity: statistical tests of, 448 

High-ability college students: studies of, 
292 


Higher education: controversy over stu- 
dent financial support of, 390; curricu- 
lum decisions in, 237; economics of, 
390; general criticism of, 379; history 
of administrative aspect of, 386; ref- 
erence works on, 392 

High-school grade record: methods for
increasing its correlation with college 
grade record, 302; use of, in predicting 
college success, 301 

History: of accreditation, 373; of adminis- 
trative aspects of higher education, 386 

History of science: teaching of, to under- 
stand the scientific age, 

Homeostasis: properties of, 461 

Homogeneity: statistical tests of, 448 

Homogeneous grouping: relation to gain 
in achievement of authoritarian in, 357 

Hotelling’s T²: alternative to, 449; ex-
pressions for the moments of, 459

Housing: of junior high schools, 68 

Housing, college: studies of, 361 

Human-relations skills needed by edu- 
cators: study of, 508 

Human resources: and financing of higher 
education, 393 

Hypothesis testing: discussion of statis- 
tics for, 447 


Ideal students: faculty and student con- 
ceptions of, 315 

Images of college: by students and par- 
ents, 315 

Imagination: nature of, 218 

Impulse control: types of college stu- 
dents’, 327 

Impulses, readiness to express: scale to 
measure, 328 

Independent samples: tests for comparing, 
456 

Independent study: accomplished through 
project method of teaching, 353; re- 
sults of use of, 353, 356 

Indexing: punch cards for, 491 

Indexing, co-ordinate: system of, 491 

Information: achievement of, as a goal of 
college, 323 

Information-giving, by teacher: instru-
ment to assess, 507

Information retrieval: by machine scan- 
ning, 490 

Information theory: as basis for test of 
independence, 451; book on, 441 

Inservice counselor training: evaluation 
of, 122, 173 

Inservice education: and curriculum de- 
velopment, 259; discussion of, 269; 
problem areas for study, effects of, 261; 
techniques of, 259 

Institutional atmosphere: studies of, 311 

Institutional productivity: relation to 
College Characteristics Index, 314 

Institutional research: bibliography of, in 
graduate and professional education, 
340; findings from a program of, 378 

Institutions: description of characteristics 
of, 316; educational, evaluations of, 
371; and relation to personal needs, 317 

Instruction: improvement of, 351; of large 
groups, 250 

Instructional materials: equitable staff use
of, 259; evaluation and selection of, 
263; methods of distribution of, 263; 
preparation for use of, 263 

Instructional materials center: 
tions for, 263 

Instructional procedures: research on, 
246; in secondary school, 49 

Instructors: differential effectiveness of, 
with various teaching methods, 357 

Instrumentation: classifications of, 514; 
in educational research, 513 

Integration: of educational experiences, 
52; of general or liberal education, 335 

Integrative teacher behavior: effect on 
children, 498 

Intellective predictors of college academic 
success: effectiveness of, 300 

Intellectual abilities: achievement of, as 
a goal of college, 323 

Intellectual ability: and college attend-
ance, 289; of students, selectivity of
colleges in, 290 

Intellectual component: increased empha-
sis on, in high school, 30

Intelligence: and behavior control, 14; 
growth of, 14; of Indian youth, 14; 
relation to effectiveness of teaching by 
machine, 517; relation to effectiveness 
of teaching methods, 356; relation to 
social power, 502 

Intelligence tests, group: use in college 
prediction, 300 

Intelligence vs. achievement: and teach- 
ing by teaching machines, 518 

Interaction process analysis: assessment 
of, 507 

Intercultural experience: effect on atti- 
tudes, 17 

Intercultural patterns: curriculum and 
instructional implications of, 201 

Interest inventories: educational, 119; em- 
ployment in prediction studies, 143 

Interests: stability of, with knowledge of 
aptitude, 142 

Interests of college students: change in 
pattern during college, 329; modifica- 
tion by general education, 329; relation 
of change in, to likelihood of gradua- 
tion, 321 

Interests of teachers: studies of, 506 

International education: bibliographies of,
490 

International relations: computer simula- 
tion exercises in, 529 

Internships: fifth-year programs, 238 

Interpretive systems for computers: dis- 
cussion of, 525 

Interval estimation: discussion of, 444; in 
the binomial model, 453 

Interview: as technique of accrediting
agency, 374; teacher-student, effect of, 
on achievement, 357 

Interviews, group: as instrument in selec- 
tion of counselors, 119 

Intramural sports program: based on stu- 
dent need, 362 

Inventory of adjustment and values: as 
criterion of effect of group discussion, 
163 

Inventory of Beliefs: institutional charac- 
teristics and score changes on, 312; 
relation to other college student charac- 
teristics, 325 

Iowa Tests of Basic Skills: use to com-
pare Dutch and American children, 204

Item analysis: electronic computer method 
for, 528 


Job satisfaction: research on, 148 

Journalism education: study of, 341; 
study of ratio of professional to liberal- 
arts content, 376 

Junior colleges: case study, 317; charac- 
teristics of, as branch of state univer- 
sity vs. state junior college, 389; en- 
rollment in, 286; public, yearbook on, 
390; studies of, 389; studies of transfer 
students in, 342; training of technicians 
in, 337 

Junior-college expansion to four-year: cur- 
riculum changes accompanying, 374; 
prevalence of, 374 

Junior high school: effectiveness of 
orientation program in, 161; housing of, 
68; organization and staff of, 68 

Juvenile delinquency: as a subculture, 8 

Juvenile delinquents: attitudes toward 
legal authorities, 6; curriculum organi- 
zation for, 35 


Kendall’s tau: discussion of, 458

Kendall’s W: alternative measure of, 458

Kerr-Speroff Empathy Test: use with re-
habilitation counselor trainees, 117

Kindergarten: advantages of, 264

Kindergarten-primary program, inte-
grated: experimentation with, 231

Kindergarten room: developmental tasks
as a basis for planning, 262

Knowledge, sociology of: discussion of,
428

Kolmogorov-Smirnov test: discussion of,

Kruskal-Wallis test: improved beta ap-
proximation to, 457

Kuder Preference Record: effect on self- 
evaluation of interests, 170; interpreta- 
tion of, 142; use on college science 
majors, 143; use with guidance work- 
ers, 118; use to predict success in voca- 
tional school, 143 








Kuhlmann-Anderson Test: and measures of
social power, 502 


Laboratory instruction: experimental
method of, 353

Language laboratories: discussion of, 517

Language study: stress on, 28 

Large-group teaching: discussion of, 250; 
new pattern of, 266; in Newton Plan, 
24 

Large schools: advantages and disadvan- 
tages of, 212 

Latent class analysis: determinant meth- 
ods for, 470 

Latent profile analysis: comparison with 
other methods, 470 

Latent structure analysis: comparison
with other methods, 470

Latin square: discussion of, 430; missing 
or mixed-up observations in, 433; opti- 
mality properties of, 436; for repeated- 
measurements experiments, 431 

Law education: study of, 341 

Law students: comparison with medical 
students, 313 

Lay co-operation: in curriculum develop- 
ment, 261 

Laymen: influence of, on secondary edu- 
cation, 28 

Leaderless role-playing: effectiveness of, 
compared with leaderless group discus- 
sion, 163 

Leadership: in curriculum development, 
258 

Leadership in student activities: in rela-
tion to student-body size, 362

Leadership techniques: of principals, 73

Leadership-training course: effects of free
group discussion in, 163

Learner-centeredness: instrument to as-
sess, 499

Learning: and behavioral goals, 218; ef-
fect of direction in, 250; evaluation of,
52; guidance of, 50; laws of, applied
in classroom, 52

Learning curve: development of, with in- 
strumentation, 513 

Learning functions: mathematical models 
of, 416 

Learning or growth parameters: estima- 
tion of, 433 

Learning, theory of: implications for 
teaching machines, 252 

Learning and thinking processes: mecha- 
nized or programed models of, 416 

Lecture method vs. project method: study 
of, 249 

Lecture vs. discussion: studies of, 52, 351 

Legal problems: of health service, 363 

Legal status and role: of different levels of 
college hierarchy, 389 

Liberal arts: in teacher education, 238 

Liberal-arts colleges: enrollment in, 286;
history of, 336; nondenominational, and 
authoritarianism, 316; private organiza- 
tion and operation of, 388 

Liberal-arts curriculum: professional con- 
tent in, 377; for technical and profes- 
sional students, 237 

Liberal education: bibliographies on, 336; 
effects of, as personified by faculty’s 
student nominees, 328; studies of, 334; 
studies of aims and methods of, 341 

Library resources: for educational re- 
search, 487 

Library-science education: studies of, 341 

Life Experience Inventory: use in college 
prediction, 305 

Limits: place of, in counseling, 132 

Linear algebra: programs for computers 
using, 527 

Linear models, general: estimation pro- 
cedures in, 434 

Linear programing: in operations re- 
search, 528; techniques for regression 
analysis, 463

Linear-regression equation: discussion of, 
462 

Literature retrieval: by machine, 490 

Literature searches, mechanical: discus- 
sion of, 490 

Literature, world: in junior high-school 
curriculum, 235 

Loans, student: importance of, 290; in- 
formation on, 363 

Local control: of curriculum, 190; of edu- 
cation, 240 

Logic: and views of teaching, 252 

Logical analysis: application to social- 
science concepts, 428 

Longitudinal study: evaluation of guid- 
ance services, 168; of college students, 
292, 327, 329 

Low-ability students: effect of a large
class on, 250 


Magazines: reports on education in, 488 

Maladjusted children: and domineering 
father, 15 

Manifest Anxiety Scale: use in college 
prediction, 304 

Mann-Whitney U statistic: extension of 
and tables for, 455; study of, 456 

Marriages, teen-age: studies of, 7, 209 

Married students: housing of, 361 

Mass media: in relation to teaching proc- 
ess, 213 

Matched samples: tests for use with, 457 

Matching: vs. randomization in studies of 
counseling, 168 

Matching groups designs: and analysis of 
covariance, 435 

Mathematical models: in economics, man- 
agement science, and psychology, 416; 
in educational research, 416 

Mathematics: continuity in curriculum of,
343; education in foreign countries, bib- 
liography on, 490; high-school enroll- 
ment in, 37; high-school programs in, 
40; improvement of instruction in, 205; 
instruction, improvement of, 70; mod- 
ern, book on, 442; new curriculums in, 
227, 236; summer program in, 344 

Matrix inversion: used in analysis of cor- 
relational data, 462 

Means, ranking of, for two normal popu- 
lations, variances unknown: methods of, 
448 


Mean-square contingency, coefficients of: 
probabilistic interpretations of, 451 

Measurement and scaling, theory of: 
books on, 441 

Measurement, scale of: and use of non-
parametric methods, 455

Median: confidence intervals for, 455 

Medical curriculum, new: report on, 239 

Medical education: sociology of, 313; 
studies of, 341 

Medical school: general education as pre- 
requisite, 335 

Medical students: characteristics of, 288 

Mental development: of adolescents, 14 

Mental health: and the school program, 
210; curriculum materials for teaching, 
211; effect of instruction in high school 
on, 50 

Mental-hygiene procedures: in group set- 
tings with school-age children, 164 

Mental measurements: reference guide to, 


Merit pay: discussion of, 77 

Metabolic rate: and growth rate, 13 

“Methods courses”: near-obliteration of, 
238; timing of, 239 

Miller Analogies Test: and counselor ef-
fectiveness, 118; use with rehabilitation
counselor trainees, 117 

Minnesota Multiphasic Personality Inven-
tory: characteristics of college dropouts
on, 322; comparison of college seniors 
with freshmen on, 329; comparison of 
scores of college students and parents 
on, 143; and counselor effectiveness, 
118; and early marriage, 7; relation to 
other variables, 249; scores of delin- 
quent vs. nondelinquent boys on, 144; 
in study of classroom variables, 508; 
use in college prediction, 304; use with 
rehabilitation counselor trainees, 117 

Minnesota Teacher Attitude Inventory: re- 
lation to classroom behavior in student 
teachers, 326; relation to other vari- 
ables, 249; in study of classroom vari- 
ables, 508; use to study teacher-pupil 
rapport, 506; use with student teachers, 
239 

Minority groups: authoritarian attitudes 
of, 315 

Misbehavior: causes of, 248; relation to
punishment used, 248 

Mobility: effects on school turnover, 209 

Model building, mathematical and me-
chanical: applications to educational re-
search, 415

Monteith College: program of, 238 

Mooney Problem Check List: and early 
marriages, 7; as criterion of effective- 
ness of college orientation course, 161 

Moral issues: in experimentation, 422 

Morale: assessment of faculty, 378; dis-
cussion of staff, 77

Mother-daughter relationship: investiga- 
tion of, 7 

Mothers: effect of television on attitudes 
toward children, 162; effect on sons’ 
reading growth, 162 

Mothers’ employment: effects on children, 
210 

Motion pictures: effectiveness of, in re- 
ducing frustration, 164; use in teaching 
physics, 51 

Motivation: discussion of, 218; questioned 
as explanatory construct, 218; relation 
to changes in IQ, 324; of student, and 
teaching methods, 250; by teacher com- 
ments on tests, 250; theory of, 49 

Moving averages: graphical approach to 
calculation of, 462 

Multi-age plan: use of, 264 

Multi-grade plan: experimentation with, 
231; use of, 264 

Multinomial distribution: discussion of, 


45: 

Multiple-choice items: in teaching-ma- 
chine programs, 516 

Multiple correlation: coefficients, confi- 
dence intervals for, 460; easy method of 
computation, 462; graphical method of 
solution, 462; use of, in college predic- 
tion, 301 

Multiple counseling: research on, 158 

Multiple-criterion approach: use to ana- 
lyze instruction, 496 

Multivariate analysis: estimates of para- 
meters in, 459; textbooks in, 441; use 
to study counselee-counselor interaction, 
136 

Multivariate confidence bounds: estima- 
tion of, 460 

Multivariate two-sample problem: non- 
parametric randomization tests of, 458 

Music: facilities for, 263; objectives of, 25 


National Commission on Accreditation: 
development of, 386 

National curriculum: discussion of, 229 

National Defense Education Act: and 
guidance services, 105; recognition of 
counselor education, 123 

Need for achievement: relation to prefer- 
ence for independent study, 356 








Negro college students: success in North- 
ern colleges, 302 

Negro college women: occupational
choices of, 151

Negro employment: changes in, 153 

Neuromuscular organisms: mechanized 
model of, 416 

Nomographs: discussion of, 450 

Noncentral t distribution: approximation
of, 450 

Noncentral variance ratios: use in analysis 
of variance, 433 

Nonclass activities: cost of, 60; discussion 
of, 57 

Nondirective teaching: effects of, 352; re- 
lation of prevalence in college to pro- 
duction of doctoral students, 353 

Nongraded elementary-school plan: dis- 
cussion of, 231; use of, 264 

Nonintellective predictors of college aca- 
demic success: effectiveness of, 302 

Nonintellectual characteristics of stu- 
dents: college selectivity in, 290 

Nonlinear models: discussion of, 432 

Non-normal populations: effect on t-test, 

Non-normality: effects on F distributions, 


Non-null cases in theory testing: discus- 
sion of, 442 

Nonparametric statistics: developments in, 
455; textbooks in, 441

Nonparametric techniques in experimental 
design: discussion of, 435 

Nonparametric tests: generalized effi- 
ciency measure for, 456 

Normal distribution: table of frequencies 
of, relative to selected class intervals 
and sample sizes, 450 

Normative data, institutional: need for, in 
evaluation, 372 

Normit analysis: tables for estimating 
normal distribution, 450 

Novelty factor: in teaching-machine learn- 
ing, 518 

Numerical analysis: discussion of, 526 

Nursing education: balance of profes- 
sional and general-education content in, 
376; eradication of dichotomy between 
liberal and professional studies, 335; in 
junior and community colleges, 337; re- 
finement of programs in, 239 


Obesity: causes of, 14 

Objectives: of secondary education, 23; 
sources of, 

Objectives, curriculum: discussion of, 
230; of secondary education, 233; use- 
fulness of, 191 

Objectives, institution’s: use in its evalua- 
tion, 372 

Observation, classroom: focal points for, 
253 
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Observation schedule, teacher: develop- 
ment of, 504, 506 

Occupational aspiration: of adolescents, 
18; determinants of, 150; relation to 
social status, 290; and socioeconomic 
class, 10, 17 

Occupational classification: research on, 
149 

Occupational course: description of, 153; 
research on, 160 

Occupational information: materials, uses 
of, 153; research on, 148; sources of, 
152 

Occupational mobility: of persons of dif- 
ferent racial extraction, 149 

Occupational roles: of women, 151 

Occupations: social status of, 150 

One-tail vs. two-tail tests: discussion of, 
449 

Operations research: discussion of, 415, 
] 

Order statistics: discussion of, 447 

Organization: of guidance services, 105; 
of secondary school, 34, 67 

Organization, school: and curriculum, 
234; research in, 212 

Orientation: of new teachers, 76 

Orientation activities: and effectiveness of 
college, 161 

Orientation to college: programs for, 364 

Overachievement: as college career type, 
313; and study habits, 314 

Overpermissiveness: effects of, in child 
rearing, 207 




Paired comparison: test for bias in, 457 

Parameters, estimation of: maximum like- 
lihood methods for, 446 

Parametric statistics: developments in, 
442 

Parent-teacher-pupil conference: evalua- 
tion of, 232 

Parents: attitude of adolescents toward, 
6; group procedures with, 162, 164; 
relationship to adolescents, 7, 15 

Partial correlation: coefficients, related 
to tau, 458; easy method of computa- 
tion of, 462 

Participation: limits on, in nonclass ac- 
tivities, 60; in nonclass activities, 60 

Path analysis: treatment of, 461 

Patriotism: school emphasis on, 24 

Pattern of response: designs for investi- 
gating values of, 432 

Pay: extra, for extra work of teachers, 62 

Peer culture: as a factor in students’ col- 
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Probability-density functions: optimal 
weighting function for smoothing of, 
462 


effectiveness 


Problem-solving activity: simulation of 
methodological approaches to, 
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Replication of investigations: plans for 
full and fractional, 431 

Research: as major occupation of doctoral 
graduates, 339; bibliographies of, 490; 
federal support of, 341; fostering of, 
in small college, 380; role of, in educa- 
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sampling and erroneous decision, 453 

Sampling, double: discussion of, 445 

Sampling procedures: discussion of, 445 

Scaling, psychological: books on, 441 

Scheduling, flexible: experiments in, 50 

Scholarly productivity of colleges: and 
personality characteristics of students, 
292; relation of index of, to attraction 
of gifted applicants, 313 

Scholarship winners: personality charac- 
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Secondary-school teachers: training of, in 
liberal arts, 239 

Selection: of college applicants, studies 
of, 298; of guidance workers, 115; pro- 
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Self-directed class: comparison with con- 
ventional, 352 

Self-concept: and vocational choice, 100 
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247; critical-incident approach to meas- 
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